[BioC] limma eBayes: how to determine goodness of fit?

Wed Jun 6 12:22:28 CEST 2007

Hi Paul.

Thanks for the explanation, I see what you are doing.  I would call
what you are doing 'model selection' and it's true, limma doesn't do
that.  But you can probably fit a 'full' model and just look for the
differences that you describe, all within limma.

For example, you could fit a model:
~ -1 + M1 + M2 + M3

where M1-M3 are binary indicators of the signal samples.  So this
fits a model with no intercept and a different mean for each 'signal'
sample.  If I understand your problem correctly, you are looking for
genes that are non-zero in each of the coefficients for M1-M3 (like
your gene1 and gene2).  But you are also interested in genes that are
non-zero in M1 and M2 and maybe not so in M3 (your gene3).  These are
just contrasts.  So you should be able to look for everything you
are interested in by constructing contrasts on M1-M3.
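
A rough sketch of what I mean, on simulated data.  The array layout
(3 replicates per signal sample), the gene names, the effect sizes and
the contrast names M1vsM3/M2vsM3 are all made up for illustration, so
adapt them to your own design:

```r
library(limma)

set.seed(1)
# Hypothetical layout: 9 arrays, 3 replicates of each of 3 signal samples.
M1 <- rep(c(1, 0, 0), each = 3)
M2 <- rep(c(0, 1, 0), each = 3)
M3 <- rep(c(0, 0, 1), each = 3)

# Simulated log-ratios: 200 genes of pure noise, then two genes with signal.
y <- matrix(rnorm(200 * 9, sd = 0.3), nrow = 200,
            dimnames = list(paste0("gene", 1:200), NULL))
y["gene1", ]    <- y["gene1", ] + 2      # non-zero in M1, M2 and M3
y["gene3", 1:6] <- y["gene3", 1:6] + 2   # non-zero in M1 and M2 only

# The 'full' model: no intercept, one mean per signal sample.
design <- model.matrix(~ -1 + M1 + M2 + M3)
fit <- lmFit(y, design)

# Test each coefficient against zero (moderated t-statistics).
fit1 <- eBayes(fit)
fit1$p.value[c("gene1", "gene3"), ]

# Differences between signal samples, expressed as contrasts.
cont <- makeContrasts(M1vsM3 = M1 - M3, M2vsM3 = M2 - M3, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
top <- topTable(fit2, coef = "M1vsM3", number = 5)
top
```

Here a gene like gene1 should show small p-values for all three
coefficients, while a gene like gene3 should stand out in the M1vsM3
contrast.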

Alternatively, you could fit all the possible models you are  
interested in and filter all the topTables.  There are not that many.

Just one other note ....

> I was happy to see that I found small residuals, and a high R- 
> squared when I modeled gene3
> like this:

I think you'll find that small residuals (or at least small relative
to the signal) and high R-squared values correspond to large (in
magnitude) t-statistics or large F-statistics.  So, everything you
need is in limma.
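
To make that concrete, here is a small base R check for a single gene
fitted by ordinary least squares.  The two-group indicator x and the
simulated values are assumptions for illustration only.  Note that
limma's moderated statistics shrink the variance estimates, so there
the correspondence holds only approximately:

```r
set.seed(2)
x <- rep(c(0, 1), each = 5)          # hypothetical binary indicator
y <- 1.5 * x + rnorm(10, sd = 0.4)   # simulated values for one gene

s <- summary(lm(y ~ x))
r2   <- s$r.squared
f    <- unname(s$fstatistic["value"])
df2  <- unname(s$fstatistic["dendf"])   # residual degrees of freedom
tval <- s$coefficients["x", "t value"]

# With one predictor, F = t^2 and F = R^2 / (1 - R^2) * residual df,
# so a high R-squared and a large |t| or F are the same information.
f_from_r2 <- r2 / (1 - r2) * df2
c(F = f, t_squared = tval^2, F_from_R2 = f_from_r2)
```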

Hope that helps.

Cheers,
Mark