[BioC] genefilter kOverA filter by range

James W. MacDonald jmacdon at uw.edu
Thu Jan 23 22:38:12 CET 2014


Hi Kristina,

On 1/23/2014 4:25 PM, Kristina M Fontanez wrote:
> Dear Bioconductors,
>
> I am trying to use the genefilter package to filter a set of log2 fold changes so that I keep those taxa with log2 fold changes > 3. However, the data contain both positive and negative values, as is usual for log2 fold comparisons.

You don't need the genefilter package to do this, and in fact genefilter 
is intended for a completely different task.

Instead you just need to use simple R commands.

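# Keep taxa (rows) where at least one sample has |log2 fold change| > 3.
mat <- as(otu_table(comp), "matrix")   # comp is a phyloseq object; taxa are rows
filt <- rowSums(abs(mat) > 3) >= 1
prune_taxa(filt, comp)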
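
To see the idea on something you can run without phyloseq, here is a minimal sketch in base R. It uses the five example rows from your message plus two made-up rows (OTU_neg and OTU_small are hypothetical, added only to show one large negative change and one row that should be dropped). The names lfc, keep and lfc_filtered are mine as well.

```r
# Example log2 fold changes copied from the question (taxa are rows).
lfc <- matrix(
  c(10.3,  1.3, 9.0,
     8.3,  2.7, 5.5,
     6.8, -0.9, 7.7,
     6.2, -1.4, 7.7,
     7.4,  1.6, 5.8),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("OTU1206", "OTU1203", "OTU1297", "OTU1338", "OTU1144"),
                  c("LvS", "DvS", "LvD"))
)

# Hypothetical extra rows (not in the original data):
# one with a strong negative change, one with no large change at all.
lfc <- rbind(lfc,
             OTU_neg   = c(-7.4,  0.2, -1.0),
             OTU_small = c( 1.0, -2.0,  0.5))

# abs() drops the sign, so -7.4 counts the same as +7.4.
# rowSums() counts, per taxon, how many samples exceed the cutoff.
keep <- rowSums(abs(lfc) > 3) >= 1

lfc_filtered <- lfc[keep, , drop = FALSE]
print(lfc_filtered)
```

If you would rather stay with filter_taxa, it should also accept a plain function in place of the filterfun() list, something like filter_taxa(comp, function(x) sum(abs(x) > 3) >= 1, TRUE). I have not run that on your object, so treat it as a suggestion rather than tested code.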

Best,

Jim



>
> Example data:
> OTU Table:          [5 taxa and 3 samples]
>                       taxa are rows
>          LvS DvS LvD
> OTU1206    10.3     1.3     9.0
> OTU1203     8.3     2.7     5.5
> OTU1297     6.8    -0.9     7.7
> OTU1338     6.2    -1.4     7.7
> OTU1144     7.4     1.6     5.8
>
> I want to create a filter that keeps the OTUs whose log2 fold changes have a magnitude > 3 in either the positive or the negative direction. However, the documentation for kOverA in the genefilter package implies that you can only input “values you want to exceed”. As the code below is currently written, I only keep taxa with a log2 fold change > +3 in at least one sample, so a taxon with a log2 fold change of -7 in a particular sample would be left out. I checked whether I was missing any OTUs by comparing the minimum value in the original OTU table (comp) with the minimum in the filtered OTU table (LFC3). As you can see, the minimum log2 fold change of -7.4 in comp does not exist in the LFC3 object, so it was excluded by my flist2 filter.
>
> Is there a function similar to kOverA that I can use to get large-magnitude changes in both the positive and negative directions?
>
> I tried the code:
>> comp
> phyloseq-class experiment-level object
> otu_table()   OTU Table:         [ 2151 taxa and 3 samples ]
> tax_table()   Taxonomy Table:    [ 2151 taxa by 6 taxonomic ranks ]
>
>> flist2<-filterfun(kOverA(1,3.0))
>> LFC3=filter_taxa(comp,flist2,TRUE)
>> LFC3
> phyloseq-class experiment-level object
> otu_table()   OTU Table:         [ 164 taxa and 3 samples ]
> tax_table()   Taxonomy Table:    [ 164 taxa by 6 taxonomic ranks ]
>> min(otu_table(comp))
> [1] -7.4
>> min(otu_table(LFC3))
> [1] -5.5
>
> Thank you,
> Kristina
>
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] genefilter_1.44.0 ggplot2_0.9.3.1   scales_0.2.3      phyloseq_1.7.12
>
> loaded via a namespace (and not attached):
>   [1] ade4_1.6-2            annotate_1.40.0       AnnotationDbi_1.24.0
>   [4] ape_3.0-11            Biobase_2.22.0        BiocGenerics_0.8.0
>   [7] biom_0.3.11           Biostrings_2.30.1     cluster_1.14.4
> [10] codetools_0.2-8       colorspace_1.2-4      DBI_0.2-7
> [13] DESeq2_1.2.8          dichromat_2.0-0       digest_0.6.4
> [16] foreach_1.4.1         GenomicRanges_1.14.4  grid_3.0.2
> [19] gtable_0.1.2          igraph_0.6.6          IRanges_1.20.6
> [22] iterators_1.0.6       labeling_0.2          lattice_0.20-24
> [25] locfit_1.5-9.1        MASS_7.3-29           Matrix_1.1-1.1
> [28] multtest_2.18.0       munsell_0.4.2         nlme_3.1-113
> [31] parallel_3.0.2        permute_0.8-0         plyr_1.8
> [34] proto_0.3-10          RColorBrewer_1.0-5    Rcpp_0.10.6
> [37] RcppArmadillo_0.4.000 reshape2_1.2.2        RJSONIO_1.0-3
> [40] RSQLite_0.11.4        splines_3.0.2         stats4_3.0.2
> [43] stringr_0.6.2         survival_2.37-4       tools_3.0.2
> [46] vegan_2.0-10          XML_3.95-0.2          xtable_1.7-1
> [49] XVector_0.2.0
>
> ------------------------------------------------------------------
> Kristina Fontanez, Postdoctoral Fellow
> fontanez at mit.edu
> Massachusetts Institute of Technology
> Department of Civil and Environmental Engineering
> 48-120E
> 15 Vassar Street
> Cambridge, MA 02139

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


