[BioC] Limma package: problem with p-value and t

James W. MacDonald jmacdon at med.umich.edu
Thu Sep 17 23:11:45 CEST 2009


Hi Julie,

Your calculation of df is off. You lose one df for each column of your 
design matrix, so you have 2 residual df. The computation of the prior 
df is outlined (I believe) in the first paper cited in the limma User's 
Guide:

http://www.bepress.com/sagmb/vol3/iss1/art3

Best,

Jim



Julie Zhu wrote:
> Hi James,
> 
> Thanks you so much for helping me solving the mystery!  Could you please
> explain why the degree freedom is so large? I only have 4 arrays, so total 3
> df (n-1). One is used for estimating dye effect, 1 is used for estimating
> mutant effect, so the residual is 1 df. Why is there an additional of
> 2.397083 df added? Is it because the information borrowing from the
> neighboring genes? How is the df.prior calculated? Thanks so much for
> sharing your knowledge!
> 
> Best regards,
> 
> Julie
> 
> 
> On 9/17/09 4:50 PM, "James W. MacDonald" <jmacdon at med.umich.edu> wrote:
> 
>>
>> Julie Zhu wrote:
>>> Dear Gordon,
>>>
>>> I noticed that there is inconsistency between the t and p.value on the
>>> topTable output using Limma package to analyze a dye swap two channel array
>>> experiment. 
>>>         logFC  AveExpr        t      P.Value  adj.P.Val        B
>>> 1401 3.216254 11.74137 22.49546 1.041470e-05 0.07874703 3.435097
>>> 1537 3.344735 11.88007 18.07448 2.702061e-05 0.07874703 2.951114
>>>
>>> The two-tailed p-value for t=22.49546 with df=1 would be 0.028
>>> 2*pt(-abs(22.49546),df=1), but the P.Value in the above output is 1.04e-05.
>>> Did I miss something? Could you please explain the inconsistency? Thanks!
>> The degrees of freedom will be somewhere around 3, but a bit larger
>> because of the ebayes fit. You can get the actual value from your
>> MArrayLM object, so what you really want is
>>
>> 2*pt(-abs(22.49546), fit$df.prior + fit$df.residual[1])
>>
>> assuming you did something like
>>
>> fit <- lmFit(<stuff>)
>> fit <- eBayes(fit)
>>
>> Best,
>>
>> Jim
>>
>>
>>> Here is the design matrix and relevant code snippets.
>>>
>>>    Dye mutant
>>>    1     -1
>>>    1     -1
>>>    1      1
>>>    1      1
>>>
>>> design = cbind(Dye=1, mutant = c(-1,-1,1,1))
>>> fit = lmFit(MA, design)
>>> ### MA is a MAlist object
>>> fit3 = eBayes(fit)
>>> topTable(fit3, number=2, coef="mutant", adjust="fdr")
>>>
>>> Here is my R session information.
>>>
>>> R version 2.9.0 (2009-04-17)
>>> i386-apple-darwin8.11.1
>>>
>>> locale:
>>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] limma_2.18.0
>>>
>>> Thank you very much for your help!
>>>
>>> Best regards,
>>>
>>> Julie
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826



More information about the Bioconductor mailing list