[BioC] edgeR question

Naomi Altman naomi at stat.psu.edu
Sat Jun 26 15:21:01 CEST 2010


Dear Gordon,
Thank you for your very detailed and clear answer to my question 
about the dispersion model.

Regarding FDR:
For discrete-valued test statistics, the distribution of the p-values 
under the null hypothesis is a discrete uniform which depends on the 
marginal total.  As a result, the distribution of p-values from the 
null hypotheses is a mixture of discrete uniforms, which can be 
marginally very non-uniform.  Even after filtering out lowly expressed 
genes, it is common to see a peak of p-values near 1.0 due to this 
effect.  It is less evident that there are multiple other peaks, one 
at each of the discrete values of the p-value for each marginal total.  
The result is that FDR computations are far too conservative for lowly 
expressed genes and far too liberal for highly expressed genes, which 
magnifies the power differential that already exists due to the 
relationship between the mean and variance.
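
The discreteness effect can be sketched with a small base-R simulation. 
This is only an illustration under simplifying assumptions, not edgeR's 
own test: two libraries of equal size, one gene per test, and an exact 
two-sided binomial test conditional on the gene's total count.  The 
function names exact_pvalues() and simulate_null() are made up for this 
example.

```r
# Exact two-sided binomial p-values for every possible count x = 0..n,
# conditional on the marginal total n (null: equal expression in both
# libraries, equal library sizes -- an assumption of this sketch).
exact_pvalues <- function(n) {
  d <- dbinom(0:n, size = n, prob = 0.5)
  # Sum the probabilities of all outcomes no more likely than the observed one.
  sapply(d, function(dx) sum(d[d <= dx * (1 + 1e-7)]))
}

# Draw null counts for `ngenes` genes that all share the same total n
# and look up their exact p-values.
simulate_null <- function(n, ngenes = 10000) {
  x <- rbinom(ngenes, size = n, prob = 0.5)
  exact_pvalues(n)[x + 1]
}

set.seed(1)
p_low  <- simulate_null(n = 4)      # lowly expressed gene: total count 4
p_high <- simulate_null(n = 4000)   # highly expressed gene: total count 4000

# With a total of 4 only three p-values are attainable: 0.125, 0.625 and 1.
support_low <- sort(unique(round(exact_pvalues(4), 10)))
print(support_low)

# Fraction of null p-values sitting at 1 (the "peak near 1.0").
print(mean(p_low > 0.999))

# Mean null p-value: well above 0.5 for the low total, close to 0.5 for the high one.
print(c(low = mean(p_low), high = mean(p_high)))

# BH-adjusted p-values can never fall below the smallest attainable raw p-value,
# so a gene with total count 4 cannot reach FDR < 0.05 in this setting.
print(min(p.adjust(p_low, method = "BH")))
```

Plotting hist(c(p_low, p_high)) on a mixture of totals shows the spikes 
at the attainable values, including the one at 1.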

--Naomi

At 05:01 AM 6/26/2010, Gordon K Smyth wrote:
>Dear Zhe,
>
>To get FDR, you must use the topTags() function.  Is your de.com 
>object a deDGEList object?  If it is, then
>
>   library(edgeR)
>   top <- topTags(de.com, n=Inf)   # n=Inf keeps every tag, not just the top few
>   write.table(top$table, file="yourfile.txt")
>
>will do what you want.  (I can't tell you what level of FDR to use 
>as your cutoff though, that's up to you.)
>
>Naomi, I don't know of any problem with FDR from edgeR.  It should 
>work just fine.
>
>Best wishes
>Gordon
>
>-----------------------------------------------
>Associate Professor Gordon K Smyth,
>NHMRC Senior Research Fellow,
>Bioinformatics Division, Walter and Eliza Hall Institute of Medical 
>Research, 1G Royal Parade, Parkville, Vic 3052, Australia.
>smyth at wehi.edu.au
>http://www.wehi.edu.au
>http://www.statsci.org/smyth
>
>
>
>------------ original message ---------------
>[BioC] edgeR question
>Naomi Altman naomi at stat.psu.edu
>Fri Jun 25 22:43:51 CEST 2010
>
>Hi Zhe,
>1. First normalize and then do the DE
>analysis.  (I found this confusing in the vignette, too.)
>
>2. I do not suggest using FDR at this time.  The
>standard FDR computations need to be adjusted for
>count data.  I do not think this has been worked out yet.
>
>--Naomi
>
>
>At 12:21 PM 6/25/2010,  wrote:
>
>>Hello,
>>
>>I am learning edgeR and would like to use it
>>to analyze my Tag-seq and RNA-seq data. I have several questions:
>>
>>1. Does the DE analysis using common
>>dispersion or moderated tagwise dispersions use
>>the TMM method for normalization?  I am not
>>sure about the relationship between Section 6
>>(Normalization) and the following sections in
>>the user manual. I suppose I should normalize
>>the data first, and then perform DE analysis.
>>
>>2. Do you suggest using P-value < 0.01? What
>>about FDR < 0.05? After saving de.tagwise (>
>>write.table(de.com[[1]], file =
>>"/Users/Zhe/edgeR/page7", sep = "\t")), I found
>>there is no FDR column. How can I
>>calculate the FDR for each gene and save it in the output file?
>>
>>Thanks a lot.
>>Best wishes,
>>
>>Zhe
>

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111
