[BioC] total count filter cutoff

Steve Lianoglou lianoglou.steve at gene.com
Wed Apr 30 23:25:09 CEST 2014


Hi,

On Wed, Apr 30, 2014 at 1:11 PM, Ryan C. Thompson <rct at thompsonclan.org> wrote:
> Filtering on raw counts has a statistical motivation, i.e. something like
> "we can't do statistics with less than X reads". Filtering on CPM is
> sometimes just used as a proxy for count-based filtering, but sometimes it
> also has a biological motivation, i.e. "we believe that CPM < X represents
> biological noise transcription rather than genuine regulated transcription
> relevant to the biological system in question". So you have to consider what
> your goals are for filtering and choose an appropriate method.

Even so, in the "biological motivation" case: if you want to filter on
CPM, shouldn't you really prefer {R|F}PKM, so that you don't "enrich"
for removal of lowly expressed short transcripts while letting lowly
expressed long transcripts slip through?
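
To make the length effect concrete, here is a minimal sketch with made-up
numbers. It is plain Python used only for illustration, not edgeR or any
other Bioconductor API. The gene names, lengths, counts, library size and
both cutoffs are assumptions chosen so that the two "low" genes have the
same expression per kilobase and differ only in length.

```python
# Hypothetical toy data: (gene name, transcript length in bp, raw read count).
# short_low and long_low have the same read density per kilobase.
genes = [
    ("short_low", 500, 5),
    ("long_low", 5000, 50),
    ("long_high", 5000, 5000),
]
library_size = 10_000_000  # assumed total mapped reads


def cpm(count, lib_size):
    """Counts per million mapped reads."""
    return count * 1e6 / lib_size


def rpkm(count, length_bp, lib_size):
    """Reads per kilobase of transcript per million mapped reads."""
    return cpm(count, lib_size) / (length_bp / 1e3)


cpm_cutoff = 1.0   # illustrative threshold, not a recommendation
rpkm_cutoff = 2.0  # illustrative threshold, not a recommendation

kept_by_cpm = [name for name, length, count in genes
               if cpm(count, library_size) >= cpm_cutoff]
kept_by_rpkm = [name for name, length, count in genes
                if rpkm(count, length, library_size) >= rpkm_cutoff]

print("kept by CPM: ", kept_by_cpm)
print("kept by RPKM:", kept_by_rpkm)
```

In this toy case the CPM filter drops short_low but keeps long_low, even
though both have the same length-normalized expression, while the RPKM
filter treats the two identically.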

-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech


