[BioC] total count filter cutoff

Steve Lianoglou lianoglou.steve at gene.com
Wed Apr 30 23:25:09 CEST 2014


On Wed, Apr 30, 2014 at 1:11 PM, Ryan C. Thompson <rct at thompsonclan.org> wrote:
> Filtering on raw counts has a statistical motivation, i.e. something like
> "we can't do statistics with fewer than X reads". Filtering on CPM is
> sometimes just used as a proxy for count-based filtering, but sometimes it
> also has a biological motivation, i.e. "we believe that CPM < X represents
> biological noise transcription rather than genuine regulated transcription
> relevant to the biological system in question". So you have to consider what
> your goals are for filtering and choose an appropriate method.

Even in the "biological motivation" case, though: if you want to
filter on CPM, shouldn't you really prefer {R|F}PKM? A CPM cutoff
doesn't account for transcript length, so you "enrich" for removal of
lowly expressed short transcripts (which yield fewer reads at the same
expression level) while letting lowly expressed long transcripts slip
through.
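
To make the length effect concrete, here is a toy sketch in plain
Python (not edgeR code). The transcript names, counts, lengths, library
size, and the cutoff of 1.0 are all made up for illustration. The two
transcripts are assumed to have the same per-base expression, so the
longer one simply collects ten times as many reads.

```python
# Hypothetical toy data: two transcripts with identical per-base expression.
LIBRARY_SIZE = 10_000_000  # total mapped reads (assumed)

transcripts = {
    "short_tx": {"count": 5, "length_bp": 500},
    "long_tx": {"count": 50, "length_bp": 5000},
}


def cpm(count, library_size):
    """Counts per million mapped reads."""
    return count / (library_size / 1e6)


def rpkm(count, length_bp, library_size):
    """Reads per kilobase of transcript per million mapped reads."""
    return count / ((length_bp / 1e3) * (library_size / 1e6))


def keep(values, cutoff):
    """Names of transcripts whose value meets the cutoff, sorted."""
    return sorted(name for name, value in values.items() if value >= cutoff)


cpm_values = {
    name: cpm(t["count"], LIBRARY_SIZE) for name, t in transcripts.items()
}
rpkm_values = {
    name: rpkm(t["count"], t["length_bp"], LIBRARY_SIZE)
    for name, t in transcripts.items()
}

CUTOFF = 1.0  # arbitrary illustrative threshold
kept_by_cpm = keep(cpm_values, CUTOFF)
kept_by_rpkm = keep(rpkm_values, CUTOFF)

print("CPM: ", cpm_values, "kept:", kept_by_cpm)
print("RPKM:", rpkm_values, "kept:", kept_by_rpkm)
```

With these made-up numbers the CPM cutoff drops only the short
transcript, while the length-normalized values are equal, so an RPKM
cutoff treats both transcripts the same way.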


Steve Lianoglou
Computational Biologist
