[BioC] cpm cutoff (edgeR) [was: total count filter cutoff (edgeR)]

Gordon K Smyth smyth at wehi.EDU.AU
Fri Jun 20 10:20:35 CEST 2014

Dear Daniel,

The guidelines are the same for miRNA-seq as for RNA-seq.  At its 
simplest, you need a reasonable minimum number of counts in at 
least some samples.  At a minimum, I think you would want at least 5 
counts, maybe more, in each sample in which the gene is expressed.

So, if your sequencing depth is about 20 million reads per sample, you 
might ask for cpm>0.3 (equivalent to 6 reads). If your sequencing depth is 
10 million reads per sample, you might ask for cpm>0.6 (again equivalent to 
6 reads).  It's not rocket science.  It's all quite rough, as the exact 
cutoff isn't important.
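
To make the arithmetic concrete, here is a rough sketch in plain Python (not 
edgeR code).  The function names, the example counts and the choice of 
"at least 2 samples" are assumptions made up for illustration; pick the 
number of samples to suit your own design.

```python
def cpm_cutoff(min_reads, depth):
    """CPM value equivalent to `min_reads` reads at a given sequencing depth."""
    return min_reads / (depth / 1e6)


def keep_gene(counts, lib_sizes, min_cpm, min_samples):
    """True if the gene's CPM exceeds `min_cpm` in at least `min_samples` libraries.

    `min_samples` is a hypothetical parameter standing in for
    "at least some samples"; it is not a recommended value.
    """
    cpms = [c / (n / 1e6) for c, n in zip(counts, lib_sizes)]
    return sum(v > min_cpm for v in cpms) >= min_samples


# 6 reads expressed as a CPM cutoff at the two depths discussed above
cutoff_20m = cpm_cutoff(6, 20_000_000)  # about 0.3
cutoff_10m = cpm_cutoff(6, 10_000_000)  # about 0.6

# Made-up counts for two genes across four libraries of 20 million reads each
lib_sizes = [20_000_000] * 4
gene_a = keep_gene([8, 7, 0, 1], lib_sizes, cutoff_20m, min_samples=2)
gene_b = keep_gene([100, 0, 0, 0], lib_sizes, cutoff_20m, min_samples=2)
```

Here gene_a is kept because two libraries exceed the cutoff.  gene_b is 
dropped even though its total count is larger, because all of its reads 
come from a single library.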

Best wishes

> Date: Thu, 19 Jun 2014 10:10:36 +0000
> From: Daniel <daniel.nicorici at gmail.com>
> To: <bioconductor at stat.math.ethz.ch>
> Subject: Re: [BioC] total count filter cutoff (edgeR)
> Gordon K Smyth <smyth at ...> writes:
>> Hi Mahnaz,
>> Why don't you follow the advice of the edgeR User's Guide (as Mark has
>> suggested)?  All the case studies in the User's Guide describe how the
>> filtering was done in a principled way.
>> Total count filtering is not so bad, but it is susceptible to being driven
>> by one library, especially by one library with a large sequence depth.
>> The procedure described by Mark and used in the guide is a compromise of
>> several considerations.
>> BTW, there are newer versions of R and edgeR available than what you are
>> using.
>> Best wishes
>> Gordon
> Hello,
> if one has miRNA (i.e. microRNA) data, what is a good choice for the
> cpm cutoff? Is it the same as for RNA-seq?
> I have not found a recommendation/case/example in the edgeR manuals/guides
> for miRNA-seq data analysis.
> Best Wishes,
> Daniel
