[BioC] Expt. design question on optimal number of replicates (in edgeR or elsewhere)

Gordon K Smyth smyth at wehi.EDU.AU
Sat May 26 13:08:55 CEST 2012


Dear Gowthaman,

There's no rigorous answer to this question, because it depends on the 
variability of your population, how large the fold changes are that you 
want to detect, how many genes you need to find, what FDR you can 
tolerate, and so on, and it's impossible to know all these things in 
advance.

However, we can learn from experience.  At our institute, we regularly 
undertake mRNA-seq experiments using genetically identical mice for which 
we can keep the biological coefficient of variation between replicates 
down to about 10%.  For these experiments, three or four biological 
replicates per group work well.  If you can keep the same consistency, 
and the expression changes you want to detect are not too small, then 
three or four can work well for you also.
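
As a rough illustration of what a biological coefficient of variation of
about 10% means, the sketch below estimates BCV for a single gene by the
method of moments. It assumes negative binomial counts with variance
mu + phi*mu^2 and BCV = sqrt(phi), and it assumes equal library sizes. The
function name and the example counts are hypothetical, and this is not how
edgeR estimates dispersion (edgeR shares information across genes).

```python
import math
from statistics import mean, variance


def moment_bcv(counts):
    """Method-of-moments BCV for one gene across replicate libraries.

    Assumes (for illustration only) negative binomial counts with
    var = mu + phi * mu**2 and equal library sizes, so that
    phi = (var - mu) / mu**2 and BCV = sqrt(phi).
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance, n - 1 denominator
    # Truncate at zero: counts less variable than Poisson give phi = 0.
    phi = max(0.0, (var - mu) / mu ** 2)
    return math.sqrt(phi)


# Hypothetical counts for one gene in four replicate libraries.
example_counts = [1000, 1150, 900, 1050]
print(round(moment_bcv(example_counts), 3))
```

For these made-up counts the estimate comes out close to 10%. With only a
few replicates a single-gene estimate like this is very noisy.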

If you work with more heterogeneous organisms (like humans), or use less 
well controlled protocols (like aggressive RNA amplification), or look for 
subtle fold changes, larger numbers will likely be needed.
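
The dependence on variability and fold change described above can be
sketched with a textbook normal approximation on the log scale. This is only
an illustrative calculation under stated assumptions, not the edgeR method:
the variance of a log count is taken as roughly 1/mu + BCV^2, a two-sided
comparison of two groups is assumed for a single gene, and multiple testing
across genes is ignored. The function name, the default alpha and power, and
the floor of two replicates are assumptions made here.

```python
import math
from statistics import NormalDist


def replicates_per_group(bcv, fold_change, mean_count, alpha=0.05, power=0.8):
    """Approximate replicates per group to detect a given fold change.

    Illustrative normal approximation for one gene:
        n ~ 2 * (z_{1-alpha/2} + z_{power})**2 * (1/mu + BCV**2) / ln(FC)**2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Approximate variance of the log count: Poisson part plus biological part.
    var_log = 1.0 / mean_count + bcv ** 2
    effect = math.log(fold_change)
    n = 2 * (z_alpha + z_power) ** 2 * var_log / effect ** 2
    # Assumption: at least two replicates are needed to estimate variability.
    return max(2, math.ceil(n))


# Hypothetical settings: inbred mice (BCV 0.1) versus a more variable
# population (BCV 0.4), both looking for a 1.5-fold change.
for bcv in (0.1, 0.4):
    print(bcv, replicates_per_group(bcv, fold_change=1.5, mean_count=100))
```

In this approximation the required number grows roughly with the square of
the BCV once counts are large, and with the inverse square of the log fold
change, which is why heterogeneous samples and subtle changes need more
replicates.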

Best wishes
Gordon

> Date: Fri, 25 May 2012 09:33:45 -0700
> From: gowtham <ragowthaman at gmail.com>
> To: bioconductor <bioconductor at r-project.org>
> Subject: [BioC] Expt. design question on optimal number of replicates
> 	(in edgeR or elsewhere)
>
> Hi Everyone,
> Thanks to the recent Bioconductor workshop I attended (and of course thanks
> to Martin Morgan's inspiration), I am stepping beyond the hist/plot functions
> in R to use Bioconductor for more powerful analysis. We have many RNA-seq
> libraries without replicates. I have read the edgeR documentation and
> understand that there is not much use in attempting a significance analysis
> without them.
>
> But now we are in a position to have biological replicates, and we are
> trying to decide how many. I understand the more the merrier, but what is a
> good number? If that is too vague a question to answer with a number: we
> plan for 4 biological replicates of each condition. Is that good enough?
>
> A few more details on the project:
> 1) The aim of the project is to identify mRNAs that are bound to one
> translational factor (compared to another factor).
> 2) Our organism has 8,000 genes.
> 3) We use a modified RNA-seq protocol in which each read represents one
> mRNA transcript.
> 4) Our libraries usually contain 10 or more transcripts per gene (in >80%
> of cases) per million mapped reads.
> 5) This is a first step/survey experiment to see what classes of genes
> are differentially bound.
>
> I appreciate your help and pointers.
> If this has been discussed before, could you please point me towards that
> discussion?
>
> gowthaman
>
>
> -- 
> Gowthaman
>
> Bioinformatics Systems Programmer.
> SBRI, 307 West lake Ave N Suite 500
> Seattle, WA. 98109-5219
> Phone : LAB 206-256-7188 (direct).



