[BioC] Gene filtering

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Sat Feb 12 01:47:31 CET 2005


I have never used genefilter and filterfun, so I cannot advise on them;
I hope the suggestions below solve your problem.

On a personal note, I just calculate and store the p-values/statistics
directly. This is more flexible because I can:

* Generate various lists of "differentially expressed" genes at
different p-value cutoffs. This is often required by the biologists, who
might want a small and confident subset for biological validation and
maybe a bigger subset for computational validation (e.g. pathway analysis)

* Rank genes by p-value

* Adjust p-values for multiple hypothesis testing

Here is one way to do this:

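mat <- matrix( rnorm(100000), ncol=100 )   # 1000 genes x 100 samples
# hypothetical gene names; the original line was cut off after "rownames("
rownames(mat) <- paste("gene", seq_len(nrow(mat)), sep="")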

g <- rep(1:2, each=50)                 # e.g. 50 normal and 50 tumour

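stats <- t( apply( mat, 1, function(z) { 
                x <- z[ g == 1 ]                   # group 1, e.g. normal
                y <- z[ g == 2 ]                   # group 2, e.g. tumour

                t.p <- t.test(x, y)$p.value        # Welch two-sample t-test
                w.p <- wilcox.test(x, y)$p.value   # Wilcoxon rank-sum test
                # difference in means; a log fold change if the data are on a log scale
                fc  <- mean(x, na.rm=TRUE) - mean(y, na.rm=TRUE)
                return( c(t.pval=t.p, wilcox.pval=w.p, fold.change=fc) )
               }))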
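
Once the p-values are stored, the three points above might look something
like the sketch below. It is self-contained, so it regenerates the simulated
data; the gene names, the cutoffs (0.01 and 0.05) and the choice of the
Benjamini-Hochberg adjustment are only assumptions for illustration.

```r
# Simulated data standing in for a real expression matrix (assumption).
set.seed(1)
mat <- matrix(rnorm(100000), ncol = 100)
rownames(mat) <- paste("gene", seq_len(nrow(mat)), sep = "")  # hypothetical names
g <- rep(1:2, each = 50)

# Per-gene Welch t-test p-values; apply() keeps the row names.
t.pval <- apply(mat, 1, function(z) t.test(z[g == 1], z[g == 2])$p.value)

# Adjust for multiple testing (Benjamini-Hochberg is one choice among several).
t.adj <- p.adjust(t.pval, method = "BH")

# Rank genes from most to least significant.
ranked <- names(sort(t.pval))

# Gene lists at different cutoffs: a small confident set and a larger one.
strict  <- names(t.pval)[t.adj < 0.01]
relaxed <- names(t.pval)[t.pval < 0.05]
```

Because the p-values are kept, changing a cutoff later only means redoing
the last two lines, not rerunning the tests.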


On Fri, 2005-02-11 at 10:08 -0500, James W. MacDonald wrote:
> Heike Pospisil wrote:
> > Hello Adaikalavan
> > 
> >> I think justRMA() uses nearly all the memory you have access to, so it
> >> is only able to handle small computations afterwards. What I would
> >> suggest is to try saving the exprSet and exiting. Then start a fresh R
> >> session and do your analysis from there. See below.
> >>  
> >>
> > 
> > Thanks for your suggestion. Saving and loading the exprSet works and 
> > helps. But, unfortunately, my filter function does not work.
> > 
> > ff1<-ttest(data,.001,na.rm=TRUE)
> > ff2<-filterfun(ff1)
> > wh2<-genefilter(exprs(data), ff2)
> > 
> > No idea :-(
> > 
> > Best wishes.
> > Heike
> > 
> I think you are setting up ff1 incorrectly. As an example, let's say 
> that your exprSet contains 10 samples, the first 5 are e.g., 
> experimental, and the second 5 are control. Then you would set up ff1 
> like this:
> 
> ff1 <- ttest(5, 0.001, na.rm = TRUE)
> 
> -or-
> 
> cl <- c(rep(1,5), rep(2,5))
> ff1 <- ttest(cl, 0.001, na.rm = TRUE)
> 
> The second method can be used if the samples are not contiguous (e.g., 
> they are ordered exp, cont, exp, cont, etc).
> 
> cl <- rep(c(1,2), 5)
> ff1 <- ttest(cl, 0.001, na.rm = TRUE)
> 
> HTH,
> 
> Jim
> 
> 
>


