[BioC] DESeq and number of replicates required for RNA-Seq
naomi at stat.psu.edu
Mon Jun 14 18:42:45 CEST 2010
The issue is a mix of expression level and sample size. For count
data, power increases with expression level. Also, the p-values are
discrete: the lower the total read count, the fewer distinct p-values
are possible, which distorts the FDR estimation.
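
To make the discreteness point concrete, here is a minimal sketch in Python. It is not what DESeq computes. It assumes one sample per condition, equal library sizes, and an exact two-sided binomial test conditional on the gene's total read count; the function name attainable_p_values is made up for this illustration.

```python
from fractions import Fraction
from math import comb


def attainable_p_values(total):
    """Every two-sided exact p-value a gene with `total` reads can produce.

    Assumption for this sketch: under the null, the reads split between two
    equally sequenced samples as Binomial(total, 1/2). Exact fractions are
    used so that equal probabilities compare as equal.
    """
    pmf = [Fraction(comb(total, k), 2 ** total) for k in range(total + 1)]
    # p-value of a split = total probability of all splits that are
    # no more likely than the observed one
    return sorted({sum(q for q in pmf if q <= pk) for pk in pmf})


for total in (5, 20, 50):
    pvals = attainable_p_values(total)
    print(total, len(pvals), float(pvals[0]))
```

Under these assumptions a gene with 5 reads in total has only three attainable p-values, and the smallest is 2/32 = 0.0625, so it cannot reach 0.05 even before any multiple-testing adjustment.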
Of course, understanding the problem does not necessarily suggest a
solution. But sample sizes will need to be large (or you will need to
sequence very deeply) if you want to detect differential expression
in genes with low expression.
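
As a rough way to see how expression level and sample size trade off, the sketch below uses a normal approximation on the log scale for negative binomial counts. This is an assumption-laden back-of-the-envelope formula, not DESeq's test: the function name replicates_per_group is hypothetical, the dispersion of 0.1 is an illustrative guess rather than an estimate from any data set, and alpha is an unadjusted per-gene level, so FDR control would require more replicates.

```python
from math import ceil, log
from statistics import NormalDist


def replicates_per_group(mean_count, fold_change, dispersion,
                         alpha=0.05, power=0.8):
    """Approximate replicates per condition to detect `fold_change`.

    Assumptions of this sketch: counts are negative binomial with variance
    mu + dispersion * mu**2, and the log ratio of the two group means is
    treated as normal (delta method).
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # variance of the log ratio with one replicate per group
    var_log_ratio = (1 / mean_count
                     + 1 / (mean_count * fold_change)
                     + 2 * dispersion)
    return ceil(z_sum ** 2 * var_log_ratio / log(fold_change) ** 2)


# Illustrative values only: dispersion 0.1 is assumed, not estimated.
for mean_count in (5, 50, 500):
    for log2fc in (1, 2, 5):
        n = replicates_per_group(mean_count, 2 ** log2fc, dispersion=0.1)
        print(mean_count, log2fc, n)
```

In this approximation, sequencing more deeply corresponds to a larger mean_count, and the required number of replicates rises as either the mean count or the fold change shrinks.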
At 09:45 AM 6/14/2010, michael watson (IAH-C) wrote:
>This follows on slightly from my experimental design thread.
>Having worked through the vignette for DESeq, it seems to work
>well. However, for the TagSeqExample.tab data set, when using an
>FDR cut off of 0.05, what we see is that we only find differential
>expression for large fold changes - an average of log2 fold change
>of 5 for up-regulated, and log2 fold change of -5 for
>down-regulated. There are very few significant results that even go
>as far down as 2 or -2 - which is still a 4-fold change.
>So, the question is, how many replicates must we have to get more
>sensitive results? Say, down to a log2FC of 1 (two-fold up- or down-regulated)?
>I can calculate this by using DESeq's own estimates of variance to
>approximate replicates for T and N in the example data, and keep
>going until my significant results start to hit a logFC of 1, but I
>wanted to know if anyone else had done this yet?
Naomi S. Altman 814-865-3791 (voice)
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111