[BioC] edgeR - estimateGLMCommonDisp - warnings - huge logFC

Gordon K Smyth smyth at wehi.EDU.AU
Fri Jul 8 13:16:26 CEST 2011


Dear Ioannis,

If the counts are zero in some libraries for some genes, then it should 
be no surprise that some of the logFC values are very large.  The raw fold 
changes are infinite.
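
As a small illustration of the point above, here is a sketch in base R. 
The counts are made up, and the offset of 0.5 is an arbitrary choice for 
the example only; it is not meant to reproduce what edgeR computes 
internally.

```r
# Hypothetical counts for one gene in two libraries of equal size.
count_a <- 0
count_b <- 25

# The raw fold change divides by zero, so it is infinite,
# and so is its log2.
raw_fc <- count_b / count_a
raw_log_fc <- log2(raw_fc)

# Adding a small offset to both counts (0.5 is an illustrative
# assumption) gives a finite but still very large logFC.
offset_log_fc <- log2((count_b + 0.5) / (count_a + 0.5))

print(c(raw = raw_log_fc, offset = offset_log_fc))
```

Any fitted logFC for such a gene will be either infinite or very large, 
depending on how the zero is handled.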

The real problem though is that running estimateGLMCommonDisp() without 
replicates is meaningless, since the dispersion is not actually estimable 
without replicates.  The function will probably just return a dispersion 
of zero in this case.

If you must analyse RNA-Seq data without replicates, you could estimate 
the dispersion very roughly by treating all the libraries as if they were 
replicates, by

   d2 <- estimateCommonDisp(d)

or

   d2 <- estimateGLMCommonDisp(d)

and then proceed using this conservative dispersion estimate.
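
A minimal sketch of that workflow follows. The count matrix is simulated 
and the sample and factor names are hypothetical. The calls follow the 
edgeR interface as I understand it in recent releases, so argument names 
may differ in older versions.

```r
library(edgeR)

# Simulated toy data (an assumption for illustration): 200 genes,
# 4 libraries from a 2x2 design with no replicates.
set.seed(1)
counts <- matrix(rnbinom(800, mu = 50, size = 5), ncol = 4)
colnames(counts) <- c("A.ctl", "A.trt", "B.ctl", "B.trt")

# Put every library in the same group so that all four are
# treated as replicates when the dispersion is estimated.
d <- DGEList(counts = counts, group = rep(1, 4))
d2 <- estimateCommonDisp(d)
print(d2$common.dispersion)

# Fit the factorial model, supplying the rough dispersion explicitly.
strain <- factor(c("A", "A", "B", "B"))
treatment <- factor(c("ctl", "trt", "ctl", "trt"))
design <- model.matrix(~ strain * treatment)
fit <- glmFit(d2, design, dispersion = d2$common.dispersion)

# Test the interaction term (the 4th coefficient of this design).
lrt <- glmLRT(fit, coef = 4)
print(head(lrt$table))
```

Because any real differences between the libraries are absorbed into the 
dispersion, the estimate is likely to be too large rather than too small, 
which is why it is conservative.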

Best wishes
Gordon

> Date: Fri, 8 Jul 2011 08:18:56 +0000
> From: "Filippis, Ioannis" <i.filippis at imperial.ac.uk>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: [BioC] edgeR - estimateGLMCommonDisp - warnings - huge logFC
> Content-Type: text/plain
>
> Hi,
>
> I am using edgeR for a 2x2 factorial design (Strain*Treatment) without 
> any replicates and the estimateGLMCommonDisp and glmFit functions.
>
> When I run estimateGLMCommonDisp, I get warnings
> 1: In optimize(f = fun, interval = interval^0.25, y = y,  ... :
>  NA/Inf replaced by maximum positive value
> and when I run glmFit and then glmLRT, I get huge fold change values for some genes.
>
> However, if I do a pairwise exactTest for the samples examined for the above contrast, the fold change for those genes is high but normal.
>
> I would really appreciate any feedback on the cause of warnings and huge logFC.
>
> Many thanks for your help.
>
> Best,
> Ioannis
