[BioC] get full variance per gene from DESeq

Graham Thomas graham.thomas at ed.ac.uk
Thu Aug 5 12:55:46 CEST 2010


Hi Wolfgang,

Thank you for your reply to my previous message. Due to the low number 
of biological conditions I am comparing, and the algorithm I would 
ideally like to use (that proposed in Jiang and Gentleman (2007)) I am 
inclined to think that I will obtain greater power if I can obtain a 
single z-score for each gene and each replicate (6 measurements) rather 
than per replicate (which is 2).

As far as I'm aware 3 measurements per group will improve my power in 
permutation testing during GSEA, if this is not the case then I may test 
using the z-scores obtained as suggested with the following:

with(BNevBTG,
     qnorm(pval/2) * sign(log2FoldChange)
   )

Is this correct?

Otherwise, I require a method to obtain an appropriate variance estimate 
for each gene per condition. I can outline the approach I am considering 
(and the problem I have) with the following argument:-

geneA geneB
1) 50 200
2) 55 150
3) 45 175

My thinking is that in order to obtain appropriate z-scores for each 
measurement geneA[1,] -> geneA[3,] (and, of course, as appropriate for 
geneB) I may use the following:

vf <- rawVarFunc( cds, "geneA)" )

And then add the variance due to the Poisson process (this is what I am 
unclear about how to obtain). With this I can use baseMeanA to generate 
z-scores for each measurement in the group. As a sanity check at this 
point it may not be absurd to do some kind of non-specific filtering 
based on the residuals I get?

Is this a reasonable approach to calculating my z-scores and, if so, how 
do I get the variance due to the counting process? If not, any comments 
or suggestions on how to proceed would be greatly appreciated! Apologies 
for the verbose explanation!

Regards,
Graham



On 02/08/10 11:00, bioconductor-request at stat.math.ethz.ch wrote:
> Message: 2 Date: Sun, 01 Aug 2010 22:50:57 +0200 From: Wolfgang Huber 
> <whuber at embl.de> To: bioconductor at stat.math.ethz.ch Subject: Re: 
> [BioC] get full variance per gene from DESeq Message-ID: 
> <4C55DE31.1010700 at embl.de> Content-Type: text/plain; 
> charset=ISO-8859-1; format=flowed Hi Graham an alternative, and maybe 
> easier and slightly more consistent approach might be to use the 
> p-values (and the sign of the foldChange) to transform to z-scores, 
> perhaps like this: with(BNevBTG, qnorm(pval/2) * sign(log2FoldChange) 
> ) PS: "more consistent" because the residuals distribution (of the 
> counts is skewed), while the p-values are uniform under the null. Best 
> wishes Wolfgang On Jul/30/10 11:47 AM, Graham Thomas wrote:
>> >  Hi All,
>> >
>> >  I am wondering whether there is an easy way of obtaining the full variance
>> >  for a given gene and condition from my DESeq analysis?
>> >
>> >  When I take a look at my results I end up with a table like so):
>> >
>> >    >  head ( BNevBTG )
>> >  id baseMean baseMeanA baseMeanB foldChange log2FoldChange
>> >  1 ENSMUSG00000001627 162.62034 119.50785 205.73284 1.7215006 0.78366671
>> >  2 ENSMUSG00000001630 45.51063 41.94099 49.08027 1.1702220 0.22678230
>> >  3 ENSMUSG00000001632 1626.19532 1328.32256 1924.06807 1.4484946 0.53455431
>> >  4 ENSMUSG00000001642 54.09075 54.53378 53.64772 0.9837521 -0.02363326
>> >  5 ENSMUSG00000001655 0.00000 0.00000 0.00000 NaN NaN
>> >  6 ENSMUSG00000001656 0.00000 0.00000 0.00000 NaN NaN
>> >  pval padj resVarA resVarB
>> >  1 9.930922e-05 0.0005666467 0.4043083 2.7960349
>> >  2 4.628048e-01 0.6573258560 0.1815569 0.5295679
>> >  3 4.216269e-04 0.0021251136 0.3436054 0.3589968
>> >  4 9.559678e-01 1.0000000000 1.0424044 0.8391154
>> >  5 NA NA NA NA
>> >  6 NA NA NA NA
>> >
>> >
>> >  I would like to transform my results into z-scores for GSEA purposes. As
>> >  far as I understand in order to do this I require my baseMean value
>> >  (which I have), my baseMeanA(and B) values, and the full variance (which
>> >  I want).
>> >
>> >  It may be noted that my practical statistics knowledge is HEAVILY
>> >  limited so any helpp at all here is greatly appreiciated!
>> >
>> >
>> >  Regards,
>> >  Graham
>> >
>> >
>>      
> -- Wolfgang Huber EMBL 
> http://www.embl.de/research/units/genome_biology/huber 
> ------------------------------


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the Bioconductor mailing list