[BioC] EdgeR condition-specific dispersion

Thu Oct 4 23:27:59 CEST 2012

Hi Thomas,

A couple of thoughts below.

On 02.10.2012, at 19:15, Thomas Frederick Willems wrote:

> I'm dealing with a factorial RNA-seq data set in which cells have been stimulated with various combinations of extra-cellular cues. As such, I was interested in applying the GLM framework in edgeR to assess the contribution of each extra-cellular cue to the differential expression of certain genes. My concern, however, is that both the expression level and the dispersion of each gene vary greatly with the combination of cues. edgeR doesn't seem to estimate condition-specific dispersions but rather one dispersion per gene (if the tagwise option is used). My question is therefore two-fold:

> 1) Does it make sense to want to estimate condition-specific dispersions?

Maybe. I haven't seen much evidence of this in the data I've analyzed. Could you show a compelling example?

> 2) Is there a way to modify the edgeR framework so that it does this?

It's not so easy. Unless I'm mistaken, the standard likelihood ratio test, which works with a single dispersion per gene shared across all conditions, can't handle this setting. A conservative approach would be to estimate the dispersions from the more variable state only and use those in the DE analysis. The catch is that these dispersion estimates are based on less data, so they may be less accurate, and the approach may not buy you much in the end.
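
To make the conservative idea concrete, here is a minimal sketch in Python. It is not edgeR code and not how edgeR estimates dispersions. It assumes counts that are already comparable across libraries (no library-size adjustment) and uses a simple method-of-moments estimate from the negative binomial relation var = mu + phi * mu^2. The function names and the toy counts are made up for illustration.

```python
from statistics import mean, variance

def moment_dispersion(counts):
    """Method-of-moments NB dispersion from var = mu + phi * mu^2.

    Assumes counts are already on a comparable scale (hypothetical
    simplification: no library-size normalisation is done here).
    """
    if len(counts) < 2:
        return 0.0  # cannot estimate a variance from a single replicate
    mu = mean(counts)
    if mu == 0:
        return 0.0
    # Truncate at zero: under-dispersed data gives a negative raw estimate.
    return max(0.0, (variance(counts) - mu) / mu ** 2)

def conservative_dispersion(counts_by_condition):
    """Estimate a dispersion per condition and keep the largest one."""
    per_condition = {
        cond: moment_dispersion(vals)
        for cond, vals in counts_by_condition.items()
    }
    return max(per_condition.values()), per_condition

# Toy counts for one gene in two conditions (invented numbers).
gene = {"control": [98, 102, 100], "cueA+cueB": [150, 250, 200]}
phi, per_condition = conservative_dispersion(gene)
print(per_condition)  # dispersion estimate for each condition
print(phi)            # the value a conservative analysis would carry forward
```

With only a few replicates per condition, each per-condition estimate is noisy, which is the loss of accuracy mentioned above.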

A recent paper shows an extension that might be able to handle this more general situation, but I haven't figured out all the details yet:
http://biostatistics.oxfordjournals.org/content/early/2012/09/16/biostatistics.kxs031.short

Hope that helps.

Best, Mark

> 
> Thanks
> 
> Thomas