[BioC] How does limma t-test work?

James W. MacDonald jmacdon at uw.edu
Fri Aug 31 15:08:46 CEST 2012


Hi Jorge,

On 8/30/2012 5:33 PM, Jorge Miró wrote:
> Hi,
>
> I am using the limma package for an analysis of differential
> expression and have a question about how the t-test in limma works. To
> my understanding, a difference between the usual t-test and the one
> used in limma is that the standard error in limma is calculated by
> using a linear method based on a Bayesian model (I don't really get
> how it works).
>
> In the users guide of limma it says that "This has the same
> interpretation as an ordinary t-statistic except that the standard
> errors have been moderated across genes, i.e., shrunk towards a common
> value, using a simple Bayesian model. This has the effect of borrowing
> information from the ensemble of genes to aid with inference about
> each individual gene". What exactly does it mean to borrow information
> from other genes? Is it, for example, the standard error of a gene on
> different arrays than the ones being compared, or the standard error of
> all other genes in the same arrays being compared, that is being used
> in the calculations?

It is based on all the other genes on the same arrays you are comparing,
not on other arrays. The rationale for doing this stems from the fact
that the sample variance is an imprecise estimate when there are few
observations: it takes a fair number of observations before the sample
variance settles close to the true underlying variance that we are
trying to estimate. In many microarray analyses, we have far fewer
observations than are really required to get a good estimate of the
variance, so we want to increase the precision of this estimate.
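
To make that concrete, here is a small simulation sketch (plain Python,
nothing to do with limma itself). It assumes normally distributed data
with a true variance of 1, and the function name and settings are made up
for illustration. It shows how much the sample variance bounces around
for small versus larger sample sizes:

```python
import random
import statistics


def sample_variance_spread(n, reps=2000, seed=1):
    """Simulate `reps` experiments of size n from N(0, 1).

    Returns (mean, standard deviation) of the resulting sample variances.
    The true variance is 1, so the standard deviation tells us how far a
    single experiment's estimate typically lands from the truth.
    """
    rng = random.Random(seed)
    estimates = [
        statistics.variance([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.mean(estimates), statistics.stdev(estimates)


for n in (3, 10, 100):
    mean_s2, sd_s2 = sample_variance_spread(n)
    print(f"n = {n:3d}: mean of s^2 = {mean_s2:.3f}, SD of s^2 = {sd_s2:.3f}")
```

The average estimate should sit near 1 in every case, but the spread is
much larger at n = 3 than at n = 100, which is the situation most
microarray experiments are in.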

One way to do that is to compute an expected variance that we think has
a higher probability of being representative of the true underlying
variance, and then adjust our observed values towards this expected
variance. This is what the eBayes() step does. It first estimates an
'average' variance from all the genes on your arrays (which will be
more precise because it is based on so much data). Then each gene's own
sample variance is pulled towards that expected variance, and the
moderated t-statistic uses the resulting shrunken variance in its
standard error.
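
The adjustment described above can be sketched in a few lines of Python.
This is only an illustration of the arithmetic, not limma's code: the
moderated variance is a weighted average of the gene's sample variance
and the prior variance, weighted by their degrees of freedom. The prior
values below are made-up numbers; eBayes() estimates them from the data
(they are returned as s2.prior and df.prior, and the shrunken variances
as s2.post). The two-group design with three arrays per group is also
just an assumption for the example.

```python
import math


def moderated_variance(s2, df, s2_prior, df_prior):
    """Shrink a gene-wise sample variance towards the prior variance.

    Weighted average, with degrees of freedom as the weights:
        (df_prior * s2_prior + df * s2) / (df_prior + df)
    """
    return (df_prior * s2_prior + df * s2) / (df_prior + df)


def t_statistic(log_fold_change, variance, n1, n2):
    """Two-group t-statistic for a given variance estimate."""
    standard_error = math.sqrt(variance) * math.sqrt(1.0 / n1 + 1.0 / n2)
    return log_fold_change / standard_error


# Hypothetical values, chosen only for illustration.
s2_prior = 0.05   # 'average' variance across all genes
df_prior = 4.0    # how much weight the prior gets
n1 = n2 = 3       # three arrays per group
df = n1 + n2 - 2  # residual degrees of freedom per gene

gene_variances = [0.005, 0.05, 0.50]
shrunk = [moderated_variance(s2, df, s2_prior, df_prior) for s2 in gene_variances]

for s2, s2_post in zip(gene_variances, shrunk):
    ordinary = t_statistic(1.0, s2, n1, n2)
    moderated = t_statistic(1.0, s2_post, n1, n2)
    print(f"s2 = {s2:.3f} -> moderated s2 = {s2_post:.4f}; "
          f"t = {ordinary:.2f} -> moderated t = {moderated:.2f}")
```

A gene with an unusually small sample variance gets a larger moderated
variance (and so a less extreme t-statistic), a gene with an unusually
large variance gets a smaller one, and a gene already at the prior value
is left unchanged. The moderated t-statistic is also referred to a
t-distribution with df + df_prior degrees of freedom rather than df.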

Best,

Jim



-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
