[BioC] multiple hits with countOverlaps function

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Thu Apr 14 03:00:51 CEST 2011


On Wed, Apr 13, 2011 at 8:46 PM, Wei Shi <shi at wehi.edu.au> wrote:
> The point is that the second read ([600, 700]) overlaps with both features and it was counted by both features. So the first feature ([100, 1000]) counts both reads but the second feature ([500, 1500]) counts the second read again. Therefore, the second read was counted twice. In other words, there are only two reads in this example, but the total number of counts output from countOverlaps is three.

Yes, and I think this is entirely to be expected.  In all my
use-cases, this is exactly what I want.

I don't get the "the second read was counted twice" part.  It is the
nature of the problem that reads have length > 1 and can overlap
multiple features, and you need to think about how you want to deal
with this.  I assume you are looking at HTseq data, and I cannot
really understand what you are trying to do.
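
To make the counting concrete, here is a minimal sketch in plain Python.
It is not the IRanges implementation, only the same arithmetic on closed
intervals.  The two features and the read [600, 700] come from the
example above; the coordinates of the first read are an assumption,
since the thread only says it overlaps the first feature.  The helper
names (overlaps, count_overlaps) are hypothetical.

```python
# Closed intervals [start, end], as in the example above.
features = [(100, 1000), (500, 1500)]
# (600, 700) is from the thread; (200, 300) is a made-up first read
# that overlaps only the first feature.
reads = [(200, 300), (600, 700)]


def overlaps(a, b):
    """True if closed intervals a and b share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def count_overlaps(query, subject):
    """For each query interval, count the subject intervals it overlaps."""
    return [sum(overlaps(q, s) for s in subject) for q in query]


# Per-feature counts: a read overlapping two features adds one to each.
per_feature = count_overlaps(features, reads)

# Per-read counts: how many features each read overlaps.
hits_per_read = count_overlaps(reads, features)

# One possible policy (an assumption, not something the thread
# prescribes): count only reads that overlap exactly one feature.
unique_per_feature = [
    sum(overlaps(f, r) and n == 1 for r, n in zip(reads, hits_per_read))
    for f in features
]

print(per_feature)         # [2, 1]
print(hits_per_read)       # [1, 2]
print(unique_per_feature)  # [1, 0]
```

The per-feature counts sum to three because the second read overlaps
both features.  Whether to keep such reads, drop them, or assign them to
a single feature is a choice that depends on the analysis.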

Kasper


