[BioC] DEXSeq update results change

António Miguel de Jesus Domingues amjdomingues at gmail.com
Thu Aug 21 11:57:45 CEST 2014


Hi Alejandro,

Thanks again for looking into this.



> I had a look at your data. Apparently the difference in dispersion
> estimates between the old and the new versions of DEXSeq can make a
> difference in the coefficients of the GLM, and therefore in the exon fold
> changes. But these changes seem to specifically affect only those
> exons with very low counts.



This is very reassuring and makes sense. The new version is the way to go
then :)
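
The check quoted further down (filter out low-count exons, then compare the sign of the fold change between versions) can be tried on simulated data. This is only a minimal sketch: every vector and parameter here is a made-up stand-in and not taken from the real DEXSeq results, and the noise model is an assumption chosen purely so that low-count exons get noisier fold changes.

```r
# Hypothetical stand-ins for the real results; nothing here comes from DEXSeq.
set.seed(1)
n_exons <- 1000
total_counts <- rnbinom(n_exons, mu = 50, size = 0.5)  # summed counts per exon
true_lfc <- rnorm(n_exons, sd = 1)                     # assumed "true" log2 fold change

# Assumption: estimation noise shrinks as the exon count grows
noise_sd <- 2 / sqrt(total_counts + 1)
lfc_old <- true_lfc + rnorm(n_exons, sd = noise_sd)    # "old version" estimate
lfc_new <- true_lfc + rnorm(n_exons, sd = noise_sd)    # "new version" estimate

# Keep only exons with more than 10 reads in total
select <- total_counts > 10

# Agreement between the two estimates, before and after filtering
cor_all <- cor(lfc_old, lfc_new)
cor_filtered <- cor(lfc_old[select], lfc_new[select])
print(c(all = cor_all, filtered = cor_filtered))

# Cross-tabulate the sign of the fold change for the filtered exons
sign_table <- table(new_up = lfc_new[select] > 0,
                    old_up = lfc_old[select] > 0)
print(sign_table)
```

With real data, the simulated vectors would be replaced by the summed counts and the log2 fold change columns of the two result objects.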

Best regards,
António



> For example, with the objects that you sent me:
>
> select <- rowSums( dxr$countData ) > 10
> plot( dxr_new$`log2fold_3_c_GFP_c`[select],
>       dxr_old$`log2fold.3_c_c.GFP_c_c.`[select] )
>
> These numbers/plots give a much more reasonable picture. The differences
> come from those exons where noise is predominant. I will dig more into this,
> but I would not worry too much about it, since the signs for the significant
> exons are consistent anyway:
>
> select2 <- which(dxr_old$padjust < 0.1)
> table( dxr_new$`log2fold_3_c_GFP_c`[select2] > 0 ,
> dxr_old$`log2fold.3_c_c.GFP_c_c.`[select2] > 0)
>
>       FALSE TRUE
> FALSE  1630    0
> TRUE      0  614
>
> Best regards,
> Alejandro
>
>
>
>
>  Dear Wolfgang and Alejandro,
>>
>> First of all, thank you for looking into this.
>>
>>     can you send one or more specific examples, i.e.
>>     - the count table for the affected gene(s), for all its exons,
>>     and/or the plotDEXSeq output
>>     - the size factors
>>
>>
>> I have prepared a data set+script for testing that will follow in a
>> separate private email, so that you can look into this in detail. While
>> preparing it I think I spotted where the difference in results might
>> originate *(1)*.
>>
>>
>> Let me clarify that my concern is not with a particular exon, but rather
>> with the general trend (ratio of up-regulated / down-regulated exons) that
>> is changed, particularly in the experimental set-up I am sending you.
>>
>>     That also leads to the second point - with only two replicates per
>>     condition, expectations about reproducibility of the result should
>>     be modest. No amount of statistical software can undo that.
>>
>>
>> I am well aware of that :) In defence of the data, I should say that the
>> experimental validation of the DGE results (for this same data) was nearly
>> 100%. So yes, few replicates can be an issue, but we have some experimental
>> validation to give us assurance that not all is bad.
>>
>> @ Alejandro
>>
>>     Just an additional question, do you see the shift in fold changes
>>     for all your exons or only for a subset of them?
>>     In older versions there was a bug that was causing some label
>>     swaps in the result columns, but this should be fixed in the most
>>     recent versions (I just want to make sure it is fixed!). As
>>     Wolfgang mentions, this would become evident by looking at the
>>     plotDEXSeq output (by looking at the normalized counts and exon
>>     usage).
>>
>>
>>
>> The scatter plot of fold change of new vs old version is a bit funky I
>> must say:
>> https://www.dropbox.com/s/l3snr4epgwbkty8/foldchange_comparison.png
>>
>>
>> *(1) *
>>
>> while playing with the example data to send you, I noticed what could be
>> an explanation while counting significantly changed exons:
>>
>> https://www.dropbox.com/s/7zc4n352ftjzqqe/nHits_comparison.pdf
>>
>> In the old version of DEXSeq without a fold-change cut-off, there are
>> more exons with decreased inclusion than with increased inclusion
>> (~2500/1500 exons). With increasingly higher fold-change cut-offs this is
>> inverted. For instance, with an FC cut-off of 10% it is 2000/1500, and with
>> an FC cut-off of 50% it is 80/400. So a completely different trend. Using
>> the new DEXSeq version, changing the FC cut-off makes no difference: the
>> trend is always more exons with increased inclusion, which is sort of what
>> I would expect.
>>
>> Could it be that the old version is less efficient at estimating the
>> fold changes when the differences are minor? Or rather, not the
>> fold changes themselves but the dispersions. That would explain the
>> differences I observed. And we only have 2 replicates, so we cannot expect
>> miracles from DEXSeq.
>>
>> Best regards,
>> António
>>
>>
>> On 16 August 2014 12:24, Wolfgang Huber <whuber at embl.de> wrote:
>>
>>     Dear Antonio
>>
>>     can you send one or more specific examples, i.e.
>>     - the count table for the affected gene(s), for all its exons,
>>     and/or the plotDEXSeq output
>>     - the size factors
>>
>>     This should help all of us understand better, and perhaps fix,
>>     what you’re unhappy about.
>>     What DEXSeq does is not a black box, it is in fact very simple, so
>>     we should be able to get to the bottom of this.
>>
>>     Regarding the question in the second paragraph: if you have reason
>>     to assume that the biological variability is the same in all your
>>     conditions (knockdowns), then the joint dispersion estimation will
>>     be more precise. But it is not biologically implausible that the
>>     assumption may be wrong (e.g. because of the different efficiency
>>     of RNAi), leading to underestimation of the true biological
>>     variability (and therefore over-calling of results) in some conditions.
>>
>>     That also leads to the second point - with only two replicates per
>>     condition, expectations about reproducibility of the result should
>>     be modest. No amount of statistical software can undo that.
>>
>>     Best wishes
>>             Wolfgang
>>
>>
>>
>> --
>> --
>> António Miguel de Jesus Domingues, PhD
>> Postdoctoral researcher
>> Deep Sequencing Group - SFB655
>> Biotechnology Center (Biotec)
>> Technische Universität Dresden
>> Fetscherstraße 105
>> 01307 Dresden
>>
>> Phone: +49 (351) 458 82362
>> Email: antonio.domingues(at)biotec.tu-dresden.de
>>
>> --
>> The Unbearable Lightness of Molecular Biology
>>
>
>


