[BioC] Filtering is not recommended with LIMMA?

Gordon K Smyth smyth at wehi.EDU.AU
Thu May 23 01:37:46 CEST 2013


Dear Miriam,

I don't know what I/NI filtering is and it isn't really my job to make a 
running commentary on every filtering method that gets published.

However the limma algorithm analyses the spread of the genewise variances. 
Any filtering method based on genewise variances will change the 
distribution of variances, will interfere with the limma algorithm and 
hence will give poor results.
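
As a rough illustration of the point above (a toy simulation in base R, not limma's own code), the sketch below draws genewise sample variances for genes that all share the same true variance, then applies a variance filter. The number of genes, the residual degrees of freedom, and the "keep the top half" rule are all assumptions chosen only for the example.

```r
set.seed(1)

# Assumed toy setting: 5000 genes, each with true variance 1 and
# 4 residual degrees of freedom, so s2 ~ chisq(df) / df.
n_genes <- 5000
df <- 4
s2 <- rchisq(n_genes, df = df) / df

# Hypothetical variance filter: keep only the more variable half.
keep <- s2 > median(s2)
s2_filtered <- s2[keep]

# The lower tail is gone after filtering, so the retained variances
# no longer look like a sample from the original distribution.
summary(s2)
summary(s2_filtered)
```

Any method that estimates a prior from the spread of the observed variances sees only the truncated set after such a filter.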

Like most people, I recommend filtering out genes that don't appear to be 
expressed in any sample.  See for example Case studies 15.3 or 15.4 in the 
limma User's Guide.
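
A minimal sketch of an expression-based filter in base R follows. The simulated matrix `y`, the cutoff of 6 on the log2 scale, and the "at least one sample" rule are assumptions made for illustration; the case studies in the User's Guide should be consulted for the choices appropriate to a given platform.

```r
set.seed(1)

# Hypothetical log2 expression matrix: 1000 probes on 6 arrays.
# The first 300 probes sit near background, the rest are expressed.
y <- rbind(
  matrix(rnorm(300 * 6, mean = 4, sd = 0.4), nrow = 300),
  matrix(rnorm(700 * 6, mean = 9, sd = 1.0), nrow = 700)
)

# Assumed cutoff: a probe counts as expressed on an array if its
# log2 intensity exceeds this value.
cutoff <- 6

# Keep probes that appear expressed in at least one sample.
# The filter looks only at expression level, never at variance.
keep <- rowSums(y > cutoff) >= 1
y_filtered <- y[keep, , drop = FALSE]

dim(y)
dim(y_filtered)
```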

However you will find if you use eBayes(fit,trend=TRUE) instead of the 
usual eBayes(fit) that limma gives pretty good results regardless of how 
much filtering you do, provided of course that the filtering is on 
expression and not on variance.
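
The two calls can be compared side by side as in the sketch below. It assumes the limma package is installed; the simulated data, the two-group design, and the object names are hypothetical and stand in for a real pre-processed data set.

```r
library(limma)

set.seed(1)

# Hypothetical log2 expression data: 500 probes on 6 arrays,
# two groups of three arrays each.
y <- matrix(rnorm(500 * 6, mean = 8, sd = 1), nrow = 500)
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Genewise linear models.
fit <- lmFit(y, design)

# Usual call: variances are moderated without regard to intensity.
fit_const <- eBayes(fit)

# trend = TRUE: the moderation allows for a mean-variance trend,
# using the average log-expression of each probe.
fit_trend <- eBayes(fit, trend = TRUE)

# Top-ranked probes for the group B versus group A coefficient.
top <- topTable(fit_trend, coef = 2, number = 5)
top
```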

The literature tends to say that the reason for filtering is to reduce the 
amount of multiple testing, but in truth the increase in power from this 
is only slight.  The more important reason for filtering in most 
applications is to remove highly variable genes at low intensities.  The 
importance of filtering is highly dependent on how you pre-processed your 
data.  Filtering is less important if you (i) use a good background 
correction or normalising method that damps down variability at low 
intensities and (ii) use eBayes(trend=TRUE) which accommodates a 
mean-variance trend.

Best wishes
Gordon


> On 21 May 2013, at 03:06, "Garcia Orellana,Miriam" <mgarciao at ufl.edu> wrote:
>
>> Dear Dr. Smyth.
>>
>> Would you be so kind as to help me decide whether or not to filter my 
>> microarray data set with a filtering method that corrects for variance, 
>> such as the I/NI method of Talloen et al. (2007)? Many researchers say 
>> that filtering should increase the power of the test, thereby 
>> increasing the chance of finding truly differentially expressed genes. 
>> However, when I analyzed my data set, I found the following (that is, a 
>> lower number of DEG when filtering):
>>
>>
>> Number of genes per orthogonal contrast
>> (adjustedP >0.05 and FC >1.4)
>>
>> Contrast     w/o filtering   I/NI filtering
>> FAT          195             118
>> FA           329             151
>> MR           169             103
>> FAT by MR    854             321
>> FA by MR     961             283
>>
>> Also, I found that Bourgon et al. (2010) do not recommend combining 
>> the limma t-statistic with filtering. So please, I would appreciate 
>> your suggestion on whether or not to filter my data set.
>>
>> Thanks in advance.
>> Miriam
>>
>>
>> ********************************
>> Miriam Garcia, MS, PhD
>> Department of Animal Sciences
>> University of Florida
