[BioC] Differential expression testing for groups with unequal variances/dispersions?

Ryan C. Thompson rct at thompsonclan.org
Sat May 25 20:43:55 CEST 2013


Hi Gordon,

Thanks for the tips. You say that edgeR should be conservative when the 
equal dispersion assumption is violated, but this is not my experience. 
(I probably wouldn't have asked here on the list unless I was worried 
about false positives.) What I've seen is that with all 4 groups
included in a single analysis, the low-dispersion time points drag down
the overall dispersion estimate, and this results in (apparently)
anticonservative results when testing for differential modification 
between the two high-dispersion time points. Obviously, I don't have a 
gold standard to compare against to conclude that the test is 
anticonservative, but I can compare the results to previous analyses
that I did before the final low-dispersion time point had come off the 
sequencer, and as expected, including the low-dispersion timepoint 
inflated the significance of most P-values in all contrasts.
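
To make the size of the effect concrete, here is a rough back-of-the-envelope
sketch in base R using the four per-timepoint dispersions from my original
post. Two things in it are assumptions rather than anything edgeR computes:
the unweighted mean is only a crude stand-in for a pooled dispersion estimate
(edgeR's real estimate is likelihood-based and will differ), and the mean
count of 100 is a made-up value picked purely for illustration.

```r
# Per-timepoint common dispersions quoted in the original post.
disp <- c("0h" = 0.407, "24h" = 0.505, "120h" = 0.115, "2w" = 0.0531)

# The biological coefficient of variation is the square root of the dispersion.
bcv <- sqrt(disp)

# ASSUMPTION: an unweighted mean stands in for the pooled estimate. edgeR's
# actual estimate is likelihood-based and will not equal this number exactly.
pooled <- mean(disp)

# Negative binomial variance: mu + dispersion * mu^2.
nb_var <- function(mu, phi) mu + phi * mu^2

mu <- 100                             # hypothetical mean count, illustration only
high <- mean(disp[c("0h", "24h")])    # dispersion of the two groups under test

# Variance the pooled model assumes, relative to what the tested groups imply.
var_ratio <- nb_var(mu, pooled) / nb_var(mu, high)

print(round(bcv, 3))
cat("pooled dispersion (crude):", round(pooled, 3), "\n")
cat("assumed variance / implied variance:", round(var_ratio, 2), "\n")
```

A ratio well below 1 means the pooled model assumes less variability than the
0 hour and 24 hour groups actually show, which is the direction that would
make tests between those two groups look more significant than they should.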

So, to get around this, would you recommend testing between time points 
by first subsetting the DGEList to just the two time points being 
compared and then re-estimating the dispersions, then finally 
conducting the test? That way, each individual test would be 
"self-contained" and not affected by groups that are not being tested. 
I could imagine that under these conditions, edgeR might be 
conservative, as you say.
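
In case it helps to see what I mean, here is a minimal sketch of that
"self-contained" workflow. It is not my real analysis: the counts are
simulated, the sample and region names are made up, and it uses
estimateDisp() and the glmFit()/glmLRT() pipeline from a current edgeR
release, so treat the details as assumptions to adapt rather than a recipe.

```r
library(edgeR)

set.seed(1)
# SIMULATED DATA (hypothetical): 500 regions, 4 timepoints x 3 replicates,
# drawn with the per-timepoint dispersions quoted in the original post.
timepoint <- factor(rep(c("T0", "T24", "T120", "T2w"), each = 3),
                    levels = c("T0", "T24", "T120", "T2w"))
phi <- c(T0 = 0.407, T24 = 0.505, T120 = 0.115, T2w = 0.0531)
counts <- sapply(as.character(timepoint), function(tp)
  rnbinom(500, mu = 100, size = 1 / phi[[tp]]))
rownames(counts) <- paste0("region", seq_len(500))
colnames(counts) <- paste0(timepoint, "_rep", rep(1:3, times = 4))

y <- DGEList(counts = counts, group = timepoint)
y <- calcNormFactors(y)

# Reference point: one dispersion estimated from all four timepoints.
y_full <- estimateDisp(y, model.matrix(~ group, data = y$samples))

# Keep only the two timepoints being compared and drop unused factor levels.
keep <- y$samples$group %in% c("T0", "T24")
y_sub <- y[, keep]
y_sub$samples$group <- droplevels(y_sub$samples$group)

# Re-estimate dispersions from the subset alone, then run the test.
design <- model.matrix(~ group, data = y_sub$samples)
y_sub <- estimateDisp(y_sub, design)
fit <- glmFit(y_sub, design)
res <- glmLRT(fit, coef = 2)   # coef 2 is T24 versus T0

cat("common dispersion, all groups:", y_full$common.dispersion, "\n")
cat("common dispersion, subset:    ", y_sub$common.dispersion, "\n")
topTags(res)
```

The trade-off is that each subset has fewer residual degrees of freedom, so
the dispersion estimates are noisier than when all samples are pooled.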

-Ryan Thompson

On Sat May 25 04:28:39 2013, Gordon K Smyth wrote:
> Hi Ryan,
>
> edgeR can't.
>
> voom can, but you have to put it together partly yourself.  Just fit
> voom to each timepoint separately, then cbind the voom output objects
> back together.
>
> Or else just proceed in edgeR as if the dispersions are equal across
> timepoints.  This will be conservative but won't give false positive
> results.
>
> Best wishes
> Gordon
>
>> Date: Fri, 24 May 2013 12:10:09 -0700
>> From: "Ryan C. Thompson" <rct at thompsonclan.org>
>> To: bioconductor <Bioconductor at r-project.org>
>> Subject: [BioC] Differential expression testing for groups with
>>     unequal variances/dispersions?
>>
>> Hi all,
>>
>> I am studying a ChIP-Seq dataset (looking at gene promoter regions in
>> human) where it appears that different experimental groups have widely
>> different dispersions/variances using edgeR/limma. I have 4 timepoints,
>> and if I use edgeR to compute the dispersion for each timepoint
>> separately, I get:
>>
>> 0 hours: 0.407
>> 24 hours: 0.505
>> 120 hours: 0.115
>> 2 weeks: 0.0531
>>
>> So the dispersion seems to range from 0.05 to 0.5. I am looking to test
>> for "differential modification" between these timepoints, as well as
>> between cell types at each timepoint, etc., and I was wondering if there
>> is any differential expression test (or dispersion estimation method?)
>> that can handle groups with different dispersions/variances.
>>
>> For reference, here is my experimental design as an Excel spreadsheet:
>> https://www.dropbox.com/s/3vnk4mai3dh39yv/chipseq-samples.xlsx
>>
>> And here is the result of plotBCV on each group (look at the last 4
>> pages for the time point groups):
>> https://www.dropbox.com/s/s4caq1p0h3e4zhm/groupdisps.pdf (Warning: big
>> PDF with lots of points which may bring your PDF reader to its knees.)
>>
>> -Ryan Thompson


