[BioC] DEXSeq update results change

António Miguel de Jesus Domingues amjdomingues at gmail.com
Wed Aug 20 08:41:16 CEST 2014


Dear Wolfgang and Alejandro,

First of all, thank you for looking into this.

> can you send one or more specific examples, i.e.
> - the count table for the affected gene(s), for all its exons, and/or the
> plotDEXSeq output
> - the size factors
>

I have prepared a data set+script for testing that will follow in a
separate private email, so that you can look into this in detail. While
preparing it I think I spotted where the difference in results might
originate *(1)*.

Let me clarify that my concern is not with a particular exon, but rather
with the general trend (ratio of up-regulated / down-regulated exons) that
is changed, particularly in the experimental set-up I am sending you.

> That also leads to the second point - with only two replicates per
> condition, expectations about reproducibility of the result should be
> modest. No amount of statistical software can undo that.
>

I am well aware of that :) In defence of the data, I should say that the
experimental validation of the DGE results (for this same data) was nearly
100%. So yes, few replicates can be an issue, but we have some experimental
validation to give us assurance that not all is bad.

@ Alejandro

> Just an additional question, do you see the shift in fold changes for all
> your exons or only for a subset of them?
> In older versions there was a bug that was causing some label swaps in the
> result columns, but this should be fixed in the most recent versions (I
> just want to make sure it is fixed!). As Wolfgang mentions, this would
> become evident by looking at the plotDEXSeq output (by looking at the
> normalized counts and exon usage).
>


The scatter plot of fold change of new vs old version is a bit funky I must
say:
https://www.dropbox.com/s/l3snr4epgwbkty8/foldchange_comparison.png
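
To put numbers on that plot, one could compute the correlation between the
two sets of fold changes and the fraction of exons whose sign flips. This is
only a minimal sketch in plain Python, not the script I used; the function
name and the exon IDs are hypothetical, and it assumes the log2 fold changes
of both versions have been exported as dictionaries keyed by exon ID. A
strongly negative correlation, or a sign-flip fraction close to 1, would
point towards swapped labels rather than a change in estimation.

```python
import math

def compare_fold_changes(old, new):
    """Compare per-exon log2 fold changes from two runs.

    old, new: dicts mapping an exon ID to its log2 fold change
    (hypothetical input format, assumed for this sketch).
    Returns (pearson_r, fraction_of_exons_with_flipped_sign).
    """
    # Keep only exons present in both runs with non-missing estimates.
    shared = [e for e in old
              if e in new
              and not math.isnan(old[e])
              and not math.isnan(new[e])]
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared exons")

    xs = [old[e] for e in shared]
    ys = [new[e] for e in shared]
    mx = sum(xs) / n
    my = sum(ys) / n

    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("fold changes are constant in one of the runs")

    pearson_r = sxy / math.sqrt(sxx * syy)
    # An exon counts as flipped when the two estimates have opposite signs.
    flipped = sum(1 for x, y in zip(xs, ys) if x * y < 0)
    return pearson_r, flipped / n


# Toy values for illustration only, not real data.
old_fc = {"E001": 1.0, "E002": -0.5, "E003": 0.2}
new_fc = {"E001": -1.0, "E002": 0.5, "E003": -0.2}
r, flip_fraction = compare_fold_changes(old_fc, new_fc)
```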


*(1)*
While playing with the example data to send you, I noticed what could be an
explanation while counting significantly changed exons:

https://www.dropbox.com/s/7zc4n352ftjzqqe/nHits_comparison.pdf

In the old version of DEXSeq, without a fold-change cut-off, there are more
exons with decreased inclusion than with increased inclusion (~2500/1500
exons). With increasingly higher fold-change cut-offs this is inverted. For
instance, with an FC cut-off of 10% it is 2000/1500, and with an FC cut-off
of 50% it is 80/400. So a completely different trend. Using the new DEXSeq
version, changing the FC cut-off makes no difference: the trend is always
more exons with increased inclusion, which is sort of what I would expect.
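
For clarity, the counting behind those numbers can be sketched roughly as
below. This is a minimal Python sketch and not the actual script; the
function name, the input format (one pair of log2 fold change and adjusted
p-value per exon) and the adjusted p-value threshold of 0.1 are assumptions
made for illustration. It also assumes that an FC cut-off of, say, 10% means
a fold change of at least 1.1 in either direction.

```python
import math

def count_hits(results, padj_cutoff=0.1, fc_cutoff=0.0):
    """Count significant exons with increased and decreased inclusion.

    results: iterable of (log2_fold_change, padj) pairs, one per exon
             (hypothetical input format, assumed for this sketch).
    fc_cutoff: minimum relative change, e.g. 0.10 for 10%, applied
               symmetrically on the log2 scale (an assumption).
    Returns (n_increased, n_decreased).
    """
    min_abs_log2fc = math.log2(1.0 + fc_cutoff)
    increased = decreased = 0
    for log2fc, padj in results:
        # Skip exons without a usable test result.
        if padj is None or math.isnan(padj) or padj >= padj_cutoff:
            continue
        if log2fc is None or math.isnan(log2fc):
            continue
        if log2fc > 0 and log2fc >= min_abs_log2fc:
            increased += 1
        elif log2fc < 0 and -log2fc >= min_abs_log2fc:
            decreased += 1
    return increased, decreased


# Toy values for illustration only, not real data.
exons = [(0.05, 0.01), (-0.05, 0.01), (-0.1, 0.02),
         (1.0, 0.001), (-0.2, 0.5), (0.8, 0.03)]
for cutoff in (0.0, 0.10, 0.50):
    up, down = count_hits(exons, fc_cutoff=cutoff)
    print(f"FC cut-off {cutoff:.0%}: {up} increased / {down} decreased")
```

Running the same function on the result tables of both DEXSeq versions, over
the same range of cut-offs, gives the up/down counts that can then be
compared side by side.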

Could it be that the old version is less efficient at estimating the
fold-changes when the differences are minor? Well, not the fold-changes but
rather the dispersions. That would explain the differences I observed. And
we only have 2 replicates, so we cannot expect miracles from DEXSeq.

Best regards,
António


On 16 August 2014 12:24, Wolfgang Huber <whuber at embl.de> wrote:

> Dear Antonio
>
> can you send one or more specific examples, i.e.
> - the count table for the affected gene(s), for all its exons, and/or the
> plotDEXSeq output
> - the size factors
>
> This should help all of us understand better, and perhaps fix, what you’re
> unhappy about.
> What DEXSeq does is not a black box, it is in fact very simple, so we
> should be able to get to the bottom of this.
>
> Regarding the question in the second paragraph: if you have reason to
> assume that the biological variability is the same in all your conditions
> (knockdowns), then the joint dispersion estimation will be more precise.
> But it is not biologically implausible that the assumption may be wrong
> (e.g. because of the different efficiency of RNAi), leading to
> underestimation of the true biological variability (and therefore
> over-calling of results) in some conditions.
>
> That also leads to the second point - with only two replicates per
> condition, expectations about reproducibility of the result should be
> modest. No amount of statistical software can undo that.
>
> Best wishes
>         Wolfgang
>
>

-- 
António Miguel de Jesus Domingues, PhD
Postdoctoral researcher
Deep Sequencing Group - SFB655
Biotechnology Center (Biotec)
Technische Universität Dresden
Fetscherstraße 105
01307 Dresden

Phone: +49 (351) 458 82362
Email: antonio.domingues(at)biotec.tu-dresden.de
--
The Unbearable Lightness of Molecular Biology
