[BioC] EdgeR: artifacts on BCV plot

Adriaan Sticker adriaan.sticker at gmail.com
Sun Feb 2 12:07:19 CET 2014


Thanks a lot for your input, Gordon!

I'm still a bit puzzled why the deviances don't have to follow a chi-squared
distribution once you estimate tagwise dispersions (that is what you are
looking at with the gof plot, I guess). I have attached examples of the GOF
plots: one based on tagwise dispersions with my manually adjusted prior.df of
25, one with the prior.df of 9 estimated by estimateDisp(), and a third with
robust estimation. I also attached the corresponding BCV plots for
completeness. It seems like the variation is overestimated for the higher
values.
If the true deviances do not follow the theoretically expected chi^2
distribution under the null, how are the p-values you get from the glmLRT
function still correct?
Maybe I misunderstand this gof plot; I noticed it is also not mentioned
in the manual.

Note that I also find about 100 more differentially expressed genes with my
manually set prior.df (320 vs 219 genes), so it makes a big difference.

Greetings


2014-02-02 Gordon K Smyth <smyth at wehi.edu.au>:

> Dear Adriaan,
>
>
> On Sun, 2 Feb 2014, Adriaan Sticker wrote:
>
>> Dear Gordon,
>>
>> Thanks a lot for your input. I tried the automatic prior.df estimation of
>> the estimateDisp() function, and it suggests a much lower prior.df than the
>> one I set manually (9 instead of 25). But when I look at the gof plot, it's
>> way off. I thought that a good guide for choosing prior.df is to look for a
>> value that puts the calculated deviances as close as possible to the
>> theoretically expected values (this is the prior.df for which the deviances
>> lie straight on the diagonal line of the gof / QQ plot).
>>
>
> No, this isn't so.  The value returned by estimateDisp() is better.
>
> Plotting the gof is valid for showing that the common or trended
> dispersion models are inadequate, but the QQ plot of the GOF statistics
> doesn't work properly any more once the tagwise dispersions have been
> estimated.  This is because the tagwise dispersions are estimated from the
> same genewise data that is being plotted.
>
> I admit that we have not made that sufficiently clear in the documentation.
>
> Best wishes
> Gordon
>
>
>
>> Or am I wrong here?
>>
>> Best Regards
>> Adriaan
>>
>>
>> 2014-02-02 Gordon K Smyth <smyth at wehi.edu.au>:
>>
>>>> Date: Fri, 31 Jan 2014 11:59:13 +0000
>>>> From: Adriaan Sticker <adriaan.sticker at gmail.com>
>>>> To: Ryan <rct at thompsonclan.org>
>>>> Cc: bioconductor at r-project.org
>>>> Subject: Re: [BioC] EdgeR: artifacts on BCV plot
>>>>
>>>> Hi
>>>> Thanks for your input. I manually checked the counts of the genes with the
>>>> lowest BCV values (see below), and I see nothing strange except that the
>>>> counts are all on the low side. So I think I will keep them in.
>>>> Is it correct to think that the reason they appear on one horizontal line
>>>> is the discreteness of the counts?
>>>>
>>>>
>>> No, it is not because of discreteness.  It is because zero is
>>> mathematically a perfectly possible value for the BCV.
>>>
>>> These genes appear to show variability that is equal to or less than
>>> Poisson variability, even after pulling them up towards the dispersion
>>> trend.  In
>>> other words, these genes are not showing any evidence of differences
>>> between biological replicates.
>>>
>>> Gordon
>>>
>>>> Greetings
>>>
>>>> Adriaan
>>>>
>>>>
>>> ______________________________________________________________________
>>> The information in this email is confidential and intended solely for the
>>> addressee.
>>> You must not disclose, forward, print or use it without the permission of
>>> the sender.
>>> ______________________________________________________________________
>>>
>>>
>>
>
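
Gordon's point above, that the QQ plot of the GOF statistics stops working
once the tagwise dispersions are estimated from the same genewise data, can
be illustrated with a toy simulation. The sketch below is not edgeR code and
does not use the negative binomial model. It assumes a simplified normal
model in which each gene has its own true variance, and it "squeezes" each
genewise sample variance towards a prior value of 1 with a chosen prior df.
All function names (simulate_gof, gof_true, gof_squeezed, q95) and all
parameter values are hypothetical choices made for illustration only.

```python
import random


def simulate_gof(n_genes=20000, n_reps=5, true_prior_df=9.0, seed=1):
    """Simulate (true variance, sample variance) pairs for a toy normal model.

    Assumption: true genewise variances follow a scaled inverse chi-square
    distribution centred near 1 with `true_prior_df` degrees of freedom.
    """
    rng = random.Random(seed)
    d = n_reps - 1  # residual degrees of freedom per gene
    genes = []
    for _ in range(n_genes):
        # A chi-square variate on k df is Gamma(shape=k/2, scale=2).
        chi2 = rng.gammavariate(true_prior_df / 2.0, 2.0)
        sigma2 = true_prior_df / chi2
        x = [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n_reps)]
        mean = sum(x) / n_reps
        s2 = sum((xi - mean) ** 2 for xi in x) / d
        genes.append((sigma2, s2))
    return d, genes


def gof_true(d, genes):
    """GOF statistic using the true variance: chi-square on d df."""
    return [d * s2 / sigma2 for sigma2, s2 in genes]


def gof_squeezed(d, genes, prior_df, prior_var=1.0):
    """GOF statistic using a variance estimated from the same gene's data."""
    stats = []
    for _, s2 in genes:
        # Weighted average of the prior variance and the sample variance.
        squeezed = (prior_df * prior_var + d * s2) / (prior_df + d)
        stats.append(d * s2 / squeezed)
    return stats


def q95(values):
    """Empirical 95% quantile (simple order-statistic version)."""
    return sorted(values)[int(0.95 * len(values))]


if __name__ == "__main__":
    d, genes = simulate_gof()
    print("true variances:      q95 = %.2f" % q95(gof_true(d, genes)))
    for prior_df in (9.0, 25.0):
        stat = gof_squeezed(d, genes, prior_df)
        print("squeezed, prior df %g: q95 = %.2f, max = %.2f"
              % (prior_df, q95(stat), max(stat)))
```

In this toy setting the statistic computed with the true variances is
chi-squared on n_reps - 1 df, whereas the statistic computed with squeezed
variances can never exceed prior_df + d, so its upper tail sits below the
chi-squared reference when the correct prior df is used. A larger prior df
pushes the tail back out, which suggests why tuning prior.df until the QQ
plot looks straight need not recover the right value. The actual edgeR
calculations differ in detail.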
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bcv_manual_estimation.png
Type: image/png
Size: 29535 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140202/70830db2/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bcv_automatic_estimation.png
Type: image/png
Size: 31517 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140202/70830db2/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gof_manual_estimation.png
Type: image/png
Size: 11421 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140202/70830db2/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gof_automatic_estimation.png
Type: image/png
Size: 12172 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140202/70830db2/attachment-0003.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gof_robust_estimation.png
Type: image/png
Size: 11644 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140202/70830db2/attachment-0004.png>

