[BioC] Minimum expression threshold?

Bob mudphud666 at yahoo.com
Tue Sep 2 06:34:24 MEST 2003


Thanks for your reply.
 
I'm still a bit puzzled over the range of expression values returned from RMA.
 
Using the following:
 
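library(affy)
my.affy <- ReadAffy()      # read all CEL files in the working directory
my.eset <- rma(my.affy)    # RMA expression measures (log2 scale)
summary(exprs(my.eset[,1]))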

Returns:
 
 Min.   : 3.083
 1st Qu.: 5.711
 Median : 6.973
 Mean   : 7.144
 3rd Qu.: 8.491
 Max.   :13.848
 
This shows the minimum expression value is 3.083, but the tissue I'm using cannot be expressing all of these genes (I'm hybridizing to the U133A chip).
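
For reference, the values in the summary above are on the log2 scale, so they can be put back on the linear (intensity) scale by raising 2 to that power. The snippet below is only a sketch of that arithmetic using the three quantiles copied from the output above; the variable names are made up for illustration. The minimum of 3.083 works out to roughly 8.5 on the linear scale: a small positive number, not zero, which is why every probeset gets a value whether or not the gene is expressed.

```r
# Quantiles copied from the summary() output above (log2 scale).
log2.vals <- c(Min = 3.083, Median = 6.973, Max = 13.848)

# Back-transform to the linear scale.
linear.vals <- 2^log2.vals
print(round(linear.vals, 1))

# A difference of 1 on the log2 scale is a 2-fold difference on the
# linear scale, so max - min gives the dynamic range as a fold change.
fold.range <- 2^(log2.vals["Max"] - log2.vals["Min"])
print(round(unname(fold.range)))
```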
 
So, I guess there are two main questions:
* Should RMA only be used for comparative studies?  What if someone wanted to create a database of all genes expressed in tissue X (not that I'm doing this, but what if)?  What I'd like to do is filter the gene list to cut down the number of tests in the multiple testing routine (and hence get less severe adjusted p-values).
 
* What exactly is the expression value that RMA returns?  I know it is log2 transformed, but I don't understand what it corresponds to.
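
To make the filtering idea in the first question concrete, here is a minimal sketch in base R. It assumes a simulated matrix standing in for exprs(my.eset), a hypothetical two-group design, and an arbitrary cutoff (6 on the log2 scale, in at least 3 samples); none of these numbers are recommendations. It uses p.adjust() from base R rather than multtest so that it runs on its own. The filter looks only at intensities, never at the group labels.

```r
set.seed(1)

# Simulated stand-in for exprs(my.eset): 1000 probesets x 14 samples, log2 scale.
n.probes <- 1000
probe.means <- runif(n.probes, min = 3, max = 12)
expr <- matrix(rnorm(n.probes * 14, mean = rep(probe.means, times = 14), sd = 0.5),
               nrow = n.probes,
               dimnames = list(paste0("probe", 1:n.probes), paste0("s", 1:14)))
group <- factor(rep(c("A", "B"), each = 7))   # hypothetical two-group design

cutoff <- 6        # hypothetical log2 cutoff, chosen only for illustration
min.samples <- 3   # hypothetical: keep probesets above cutoff in >= 3 samples

# Non-specific filter: based on intensity alone, not on group membership.
keep <- rowSums(expr > cutoff) >= min.samples
filtered <- expr[keep, , drop = FALSE]

# One two-sample t-test per remaining probeset, then Holm adjustment.
pvals <- apply(filtered, 1, function(x) {
  t.test(x[group == "A"], x[group == "B"])$p.value
})
adj <- p.adjust(pvals, method = "holm")

cat("tests before filtering:", nrow(expr), "\n")
cat("tests after filtering: ", nrow(filtered), "\n")
```

Fewer rows going into p.adjust() means a smaller multiplicity penalty, which is the effect the question is after. Whether any particular cutoff is defensible is a separate matter.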
 
Sorry if these questions are answered somewhere - I've looked, but maybe not well enough.
 
Thanks in advance.

"Rafael A. Irizarry" <ririzarr at jhsph.edu> wrote:
hi! i don't know of any good references. in practice i don't like to 
arbitrarily decide on such cut-offs. this could be very problematic if 
you use MAS 5.0, but with other expression measures such as PM-only 
Li-Wong and RMA you usually don't need this filtering step. 

sorry i can't be of more help,
rafael


On Tue, 26 Aug 2003, Bob wrote:

> Hello,
> I have started using bioconductor (which is great, by
> the way), and I have a question regarding how to
> choose a minimum expression threshold. 
> 
> I have read in the Affymetrix cel files, calculated
> expression using rma(), and now have a data frame with
> ~22k expression values across 14 samples (using the
> U133A chip). There are expression values for each
> Affy spot - although it is probably not true that this
> tissue expresses all 22k genes. My question is how do
> I choose a threshold above which I consider the gene
> to be expressed?
> 
> In addition (please correct me if I'm wrong), using
> only the number of expressed genes (or at least not
> all of the spots) will make for better values using
> the multtest package.
> 
> Can someone point me in the right direction, or point
> out some good references on this topic?
> 
> Thanks.
> 

