[BioC] genefilter

Martin Morgan mtmorgan at fhcrc.org
Tue Jul 29 13:19:27 CEST 2014


On 07/29/2014 04:16 AM, carol white wrote:
> For a positive quantity like variance, absolute value is not relevant. What
> about a t-test statistic, which can be positive or negative? The largest
> t statistic in absolute value corresponds to the lowest t-test p-value. Does
> findLargest take the largest t statistic in absolute value, and does it
> correspond to the largest variation?

findLargest finds the largest signed value, so a large negative t statistic
loses to any positive one. If you want the largest absolute value, then pass
abs(<your statistic here>) as the second argument.

You can see that findLargest isn't doing anything too fancy:

 > tail(findLargest, 3)

14     tSsp = split.default(testStat, lls)
15     sapply(tSsp, function(x) names(which.max(x)))
16 }
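
To make the difference concrete, here is a minimal base-R sketch that reuses only the split / which.max idiom from the two lines printed above. It is not findLargest itself: the helper pick_largest, the probe names p1..p4, the gene labels A and B, and the t statistics are all made up for illustration, and no annotation lookup is done.

```r
# Hypothetical toy data: four probes mapping to two genes
tstat <- c(p1 = 2.1, p2 = -3.5, p3 = 0.4, p4 = -0.2)
gene  <- c("A", "A", "B", "B")

# Hypothetical helper mirroring the split / which.max idiom shown above
pick_largest <- function(stat, gene) {
    by_gene <- split(stat, gene)                       # one named vector per gene
    sapply(by_gene, function(x) names(which.max(x)))   # probe with the maximum
}

signed   <- pick_largest(tstat, gene)        # largest signed statistic per gene
unsigned <- pick_largest(abs(tstat), gene)   # largest magnitude per gene

print(signed)
print(unsigned)
```

For gene A the signed version picks p1 (2.1) while the abs() version picks p2 (|-3.5| = 3.5); for gene B both pick p3.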

>
> Many thanks
>
>
> On Tuesday, July 29, 2014 12:30 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
>
> On 07/28/2014 03:48 AM, carol white wrote:
>  > Hi,
>  > Does findLargest find the largest test statistics in absolute value which
>  > corresponds to the probe with the largest variation?
>
> Here's an example
>
>  > library(genefilter); library(Biobase)
>  > data(sample.ExpressionSet)
>  > eset = sample.ExpressionSet
>  > x = findLargest(featureNames(eset), rowVars(exprs(eset)), annotation(eset))
>
> The first argument is a character vector of feature (probeset) names. The second
> argument is the variance of each probeset across all samples.
>
> The result is
>  > head(x)
>          10002    100128124    100129271        10018    100287387    100505879
> "31667_r_at"  "31663_at"  "31326_at" "31611_s_at"  "31657_at"  "31648_at"
>
> which indicates that for Entrez gene 10002 the probeset with largest variance is
> 31667_r_at, etc.
>
> This could be used to subset the eset
>
>  > eset[x,]
>
> if you were interested in retaining just the probes with largest variance.
>
> If the statistic rowVars(exprs(eset)) were replaced with something else (e.g.,
> interquartile range), then findLargest would of course not return the probe with
> largest variance, but with the largest whatever that statistic was.
>
> Martin
>
>
>
>  >
>  > Thanks
>  >
>  > carol
>
>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>
>
>




