[BioC] EdgeR: dispersion estimation

Gordon K Smyth smyth at wehi.EDU.AU
Fri Apr 25 08:58:25 CEST 2014


No rounding.  Uses them as-is by implementing a continuous version of the 
negative binomial likelihood (factorials become gamma functions).

This makes the behaviour of the edgeR glm functions similar to that of the 
original exact conditional likelihood edgeR functions, which were always 
able to use fractional counts.
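
The idea of a continuous negative binomial likelihood can be illustrated with a short R sketch. This is not edgeR's internal code: nb_loglik is a hypothetical helper, and the mean/dispersion parameterisation (variance = mu + phi * mu^2) is assumed here for illustration. The only point is that replacing the factorial y! with gamma(y + 1) makes the log-likelihood well defined at fractional y.

```r
# Negative binomial log-likelihood with mean mu and dispersion phi,
# written with lgamma() so that y does not have to be an integer.
# nb_loglik is a hypothetical helper, not an edgeR function.
nb_loglik <- function(y, mu, phi) {
  r <- 1 / phi  # NB "size" parameter
  lgamma(y + r) - lgamma(r) - lgamma(y + 1) +
    r * log(r / (r + mu)) + y * log(mu / (r + mu))
}

# At integer y the result agrees with base R's dnbinom()
y_int <- c(0, 3, 10)
all.equal(nb_loglik(y_int, mu = 5, phi = 0.25),
          dnbinom(y_int, size = 4, mu = 5, log = TRUE))

# At fractional y (for example an expected count) it is still finite
nb_loglik(3.7, mu = 5, phi = 0.25)
```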

Gordon
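
For readers who see the symptom described in the quoted thread below, a quick base-R check can show whether a table contains fractional values and whether the column sums look like sequencing depths. This is only a sketch: the counts matrix is invented toy data standing in for a real genes-by-samples table, and the 1e-8 tolerance is an arbitrary choice.

```r
# Toy matrix standing in for a real count table (genes x samples);
# the values are invented for illustration only.
counts <- matrix(c(10, 0, 3.42, 7,
                   5, 12.9, 0, 1),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("s1", "s2")))

# TRUE if any entry differs from its rounded value
has_fractional <- any(abs(counts - round(counts)) > 1e-8)
has_fractional

# Column sums should be on the scale of each sample's sequencing depth;
# very small or near-identical sums suggest RPKM-like values instead
colSums(counts)
```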

On Thu, 24 Apr 2014, Ryan Thompson wrote:

> Hi Gordon,
>
> What exactly does edgeR now do with non-integer counts? Does it just round
> them to the nearest integer and proceed as normal, or does it somehow use
> them as is?
>
> -Ryan
> On Apr 24, 2014 7:40 PM, "Gordon K Smyth" <smyth at wehi.edu.au> wrote:
>
>> Dear Yanzhu,
>>
>> My guess is that some of your "count data" are not integers.  For example,
>> are they perhaps expected counts from RSEM?  In the edgeR version that you
>> are using, the GLM dispersion estimation functions do not work correctly
>> for non-integer data.  (They weren't intended to.)
>>
>> Please update your copies of R and edgeR to the latest versions.
>> Bioconductor 2.14 was released a couple of weeks ago.  All edgeR functions
>> now permit non-integer "counts".
>>
>> Also check that your data are counts and not RPKM or similar.  The counts
>> should sum to the total sequence depth for each sample.
>>
>> Best wishes
>> Gordon
>>
>>> Date: Wed, 23 Apr 2014 07:58:30 -0700 (PDT)
>>> From: "Yanzhu [guest]" <guest at bioconductor.org>
>>> To: bioconductor at r-project.org, mlinyzh at gmail.com
>>> Subject: [BioC] EdgeR: dispersion estimation
>>>
>>>
>>> Dear community,
>>>
>>> I am using edgeR to analyse my RNA-seq project (as mentioned in my
>>> previous posts about multi-factor analysis of RNA-Seq project), and I
>>> have run into an issue with dispersion estimation.
>>> I first used estimateGLMCommonDisp and then estimateGLMTagwiseDisp
>>> to estimate the dispersions. However, I got 3.999943 for
>>> y$common.dispersion and 0.0624991 for every element of
>>> y$tagwise.dispersion. Shouldn't the tagwise dispersions differ
>>> from one another?
>>>
>>> The following is the code I used:
>>> ##Read in count data
>>> T<-data.frame(HTSeqRE)
>>>
>>> ##Factors:
>>> Design<-data.frame(HTSeqCondRE[,2:4])
>>> Rep<-as.factor(Design$Rep)
>>> Line<-as.factor(Design$Line)
>>> Sex<-as.factor(Design$Sex)
>>> design<-model.matrix(~Line+Rep+Sex+Line:Rep+Line:Sex+Rep:Sex+Line:Sex:Rep)
>>>
>>> group<-paste(Design$Line,Design$Sex,Design$Rep,sep=".")
>>> y<-DGEList(counts=T,group=group)
>>>
>>>
>>> y<-calcNormFactors(y,method="TMM")
>>>
>>> y<-estimateGLMCommonDisp(y,design)
>>> y<-estimateGLMTagwiseDisp(y,design)
>>>
>>> y$common.dispersion
>>> [1] 3.999943
>>>
>>> y$tagwise.dispersion
>>> [1] 0.0624991 0.0624991 0.0624991 0.0624991 0.0624991
>>> 13474 more elements ...
>>>
>>>
>>> Yanzhu
>>>
>>> -- output of sessionInfo():
>>>
>>> sessionInfo()
>>> R version 3.0.1 (2013-05-16)
>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252
>>> [2] LC_CTYPE=English_United States.1252
>>> [3] LC_MONETARY=English_United States.1252
>>> [4] LC_NUMERIC=C
>>> [5] LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] DESeq_1.12.1       lattice_0.20-27    locfit_1.5-9.1     Biobase_2.20.1
>>> [5] BiocGenerics_0.6.0 edgeR_3.2.4        limma_3.16.8
>>>
>>> loaded via a namespace (and not attached):
>>>  [1] annotate_1.38.0      AnnotationDbi_1.22.6 DBI_0.2-7
>>>  [4] genefilter_1.42.0    geneplotter_1.38.0   grid_3.0.1
>>>  [7] IRanges_1.18.4       RColorBrewer_1.0-5   RSQLite_0.11.4
>>> [10] splines_3.0.1        stats4_3.0.1         survival_2.37-4
>>> [13] tools_3.0.1          XML_3.98-1.1         xtable_1.7-3
>>>
>>> --
>>> Sent via the guest posting facility at bioconductor.org.
>>>
>>
>> ______________________________________________________________________
>> The information in this email is confidential and intend...{{dropped:4}}
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
