[BioC] total count filter cutoff

Steve Lianoglou lianoglou.steve at gene.com
Wed Apr 30 23:56:42 CEST 2014


Sorry, didn't mean to have that come across as a "correction" ... just
wanted to add some more confusion (or clarity(?)) to the debate is all
;-)

-steve

On Wed, Apr 30, 2014 at 2:49 PM, Ryan C. Thompson <rct at thompsonclan.org> wrote:
> Yes, that is a good point that I forgot to mention. Thanks for correcting
> me.
>
> -Ryan
>
>
> On Wed 30 Apr 2014 02:25:09 PM PDT, Steve Lianoglou wrote:
>>
>> Hi,
>>
>> On Wed, Apr 30, 2014 at 1:11 PM, Ryan C. Thompson <rct at thompsonclan.org> wrote:
>>>
>>> Filtering on raw counts has a statistical motivation, i.e. something like
>>> "we can't do statistics with fewer than X reads". Filtering on CPM is
>>> sometimes just used as a proxy for count-based filtering, but sometimes
>>> it also has a biological motivation, i.e. "we believe that CPM < X
>>> represents biological noise transcription rather than genuine regulated
>>> transcription relevant to the biological system in question". So you
>>> have to consider what your goals are for filtering and choose an
>>> appropriate method.
>>
>>
>> Even still, in the "biological motivation" case: if you want to use
>> CPM, shouldn't you really prefer {R|F}PKM so you don't "enrich" for
>> removal of lowly expressed short transcripts while letting lowly
>> expressed long transcripts slip through?
>>
>> -steve
>>
>
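
The CPM-versus-{R|F}PKM point in the quoted exchange can be made concrete
with a small sketch. Everything here is an assumption for illustration: the
library size, the two transcripts, their lengths and counts, and the cutoff
of 1 are made-up numbers, and the helper functions are plain Python rather
than any Bioconductor API.

```python
def cpm(count, library_size):
    """Counts per million mapped reads."""
    return count / (library_size / 1e6)


def rpkm(count, length_bp, library_size):
    """Reads per kilobase of transcript per million mapped reads."""
    return count / ((length_bp / 1e3) * (library_size / 1e6))


library_size = 10_000_000  # hypothetical total mapped reads

# Hypothetical transcripts with the same read density (10 reads per kb).
transcripts = {
    "short_tx": {"count": 5, "length_bp": 500},
    "long_tx": {"count": 50, "length_bp": 5000},
}

cutoff = 1.0  # illustrative threshold, applied to either measure

for name, tx in transcripts.items():
    tx["cpm"] = cpm(tx["count"], library_size)
    tx["rpkm"] = rpkm(tx["count"], tx["length_bp"], library_size)
    # Keep a transcript only if it reaches the cutoff on the given measure.
    tx["keep_by_cpm"] = tx["cpm"] >= cutoff
    tx["keep_by_rpkm"] = tx["rpkm"] >= cutoff
    print(name, tx["cpm"], tx["rpkm"], tx["keep_by_cpm"], tx["keep_by_rpkm"])
```

With these made-up numbers both transcripts have the same RPKM, yet the CPM
cutoff drops the short one and keeps the long one, which is the length bias
described in the thread.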



-- 
Steve Lianoglou
Computational Biologist
Genentech


