> Filtering on raw counts has a statistical motivation, i.e. something like
> "we can't do statistics with fewer than X reads". Filtering on CPM is
> sometimes just used as a proxy for count-based filtering, but sometimes it
> also has a biological motivation, i.e. "we believe that CPM < X represents
> transcriptional noise rather than genuine regulated transcription
> relevant to the biological system in question". So you have to consider what
> your goals are for filtering and choose an appropriate method.

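Even so, in the "biological motivation" case: if you want to filter on
CPM, shouldn't you really prefer {R|F}PKM? CPM is not length-normalized,
so at the same expression level a short transcript gets fewer reads than
a long one. A CPM cutoff therefore "enriches" for removal of lowly
expressed short transcripts while letting lowly expressed long
transcripts slip through.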
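
To make the concern concrete, here is a minimal sketch in plain Python. The two genes, their lengths, their counts, the library size, and the cutoff of 1 are all made-up numbers chosen for illustration, and the function names are my own rather than any package's API. Both genes have the same FPKM, but only the long one survives a CPM cutoff.

```python
# Toy illustration: hypothetical genes with identical FPKM but different lengths.
# All numbers below are invented for the example.

LIBRARY_SIZE = 20_000_000  # total mapped reads (assumed)

# gene name -> (transcript length in bp, raw read count)
GENES = {
    "short_gene": (500, 10),
    "long_gene": (5000, 100),
}


def cpm(count, library_size):
    """Counts per million mapped reads (no length normalization)."""
    return count * 1e6 / library_size


def fpkm(count, length_bp, library_size):
    """Fragments per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (length_bp * library_size)


def keep_by_cpm(genes, library_size, cutoff):
    """Names of genes whose CPM is at or above the cutoff."""
    return [g for g, (_, n) in genes.items() if cpm(n, library_size) >= cutoff]


def keep_by_fpkm(genes, library_size, cutoff):
    """Names of genes whose FPKM is at or above the cutoff."""
    return [
        g for g, (length, n) in genes.items()
        if fpkm(n, length, library_size) >= cutoff
    ]


if __name__ == "__main__":
    for name, (length, n) in GENES.items():
        print(name, "CPM =", cpm(n, LIBRARY_SIZE),
              "FPKM =", fpkm(n, length, LIBRARY_SIZE))
    print("kept by CPM >= 1: ", keep_by_cpm(GENES, LIBRARY_SIZE, 1.0))
    print("kept by FPKM >= 1:", keep_by_fpkm(GENES, LIBRARY_SIZE, 1.0))
```

With these toy numbers the short gene has CPM 0.5 and the long gene CPM 5, while both have FPKM 1, so the CPM filter drops only the short gene even though the two are expressed at the same per-kilobase level.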


