[BioC] DESeq and number of replicates required for RNA-Seq

Naomi Altman naomi at stat.psu.edu
Mon Jun 14 18:42:45 CEST 2010

The issue is a mix of expression level and sample size.  For count 
data, power is higher when expression is higher.  Also, the 
p-values are discrete: the lower the total read count, the fewer 
distinct p-values are possible, which distorts the FDR estimation.
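
To make the discreteness concrete, here is a minimal sketch in Python.  It 
is not DESeq's negative binomial test.  It assumes a simpler conditional 
exact binomial test with equal library sizes in the two conditions, so 
that under the null each read falls in condition A with probability 0.5.  
The function names are made up for illustration.

```python
from fractions import Fraction
from math import comb


def exact_two_sided_p(k, n):
    """Exact two-sided p-value for k of n total reads landing in condition A.

    Assumes a Binomial(n, 0.5) null (equal library sizes, no replicates).
    The p-value sums the probabilities of all outcomes that are no more
    likely than the observed one.
    """
    probs = [Fraction(comb(n, i), 2 ** n) for i in range(n + 1)]
    return sum(p for p in probs if p <= probs[k])


def achievable_p_values(n):
    """All distinct p-values the test can produce for a gene with n reads."""
    return sorted({exact_two_sided_p(k, n) for k in range(n + 1)})


for total_reads in (4, 10, 100):
    p_values = achievable_p_values(total_reads)
    print(total_reads, len(p_values), float(p_values[0]))
```

Under these assumptions a gene with 4 total reads can only produce three 
distinct p-values, and the smallest is 0.125, so it can never pass a 0.05 
cutoff no matter how the reads split.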

Of course, understanding the problem does not necessarily suggest a 
solution.  But sample sizes will need to be large (or you will need to 
sequence very deeply) if you want to detect differential expression 
in low-expressing genes.
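
As a rough illustration of how expression level and replicate number 
trade off, the sketch below uses a Wald-type normal approximation on the 
log scale.  It is a back-of-the-envelope calculation and not the DESeq 
procedure.  It assumes negative binomial counts with variance 
mu + dispersion * mu^2, a per-gene significance level rather than an FDR 
cutoff, and placeholder default values for dispersion and alpha that are 
not estimated from any data set.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _log_scale_variance(mean_count, log2_fold_change, dispersion):
    """Delta-method variance of the log fold change for one sample per group."""
    mean_other = mean_count * 2 ** log2_fold_change
    return 1 / mean_count + 1 / mean_other + 2 * dispersion


def approx_power(mean_count, log2_fold_change, n_per_group,
                 dispersion=0.1, alpha=0.001):
    """Approximate power of a two-sided Wald test (far tail ignored)."""
    delta = abs(log2_fold_change) * log(2)  # natural-log fold change
    variance = _log_scale_variance(mean_count, log2_fold_change, dispersion)
    standard_error = sqrt(variance / n_per_group)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(delta / standard_error - z_crit)


def replicates_needed(mean_count, log2_fold_change,
                      dispersion=0.1, alpha=0.001, power=0.8):
    """Smallest n per group (at least 2) reaching the target approximate power."""
    delta = abs(log2_fold_change) * log(2)
    variance = _log_scale_variance(mean_count, log2_fold_change, dispersion)
    z_sum = _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)
    return max(2, ceil(z_sum ** 2 * variance / delta ** 2))


# Compare a low-count and a high-count gene for a two-fold change.
for mean_count in (5, 50, 500):
    print(mean_count, replicates_needed(mean_count, log2_fold_change=1))
```

In this approximation the 1/mean terms shrink with deeper sequencing, 
while the dispersion term does not, so only more replicates reduce the 
contribution of biological variability.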


At 09:45 AM 6/14/2010, michael watson (IAH-C) wrote:
>This follows on slightly from my experimental design thread.
>Having worked through the vignette for DESeq, it seems to work 
>well.  However, for the TagSeqExample.tab data set, when using an 
>FDR cut off of 0.05, what we see is that we only find differential 
>expression for large fold changes - an average of log2 fold change 
>of 5 for up-regulated, and log2 fold change of -5 for 
>down-regulated.  There are very few significant results that even go 
>as far down as 2 or -2 - which is still a 4-fold change.
>So, the question is, how many replicates must we have to get more 
>sensitive results?  Say, down to a log2FC of 1 (two-fold up- or down-regulated)?
>I can calculate this by using DESeq's own estimates of variance to 
>approximate replicates for T and N in the example data, and keep 
>going until my significant results start to hit a logFC of 1, but I 
>wanted to know if anyone else had done this yet?

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111
