[BioC] GenomicRanges Use Cases - subsetByOverlaps

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Nov 8 15:30:54 CET 2011


On Tue, Nov 8, 2011 at 4:27 AM, James Perkins <j.perkins at ucl.ac.uk> wrote:
> Hi,
> I am having some problems following the example in the vignette for
> GenomicRanges, specifically:
> 3.4 Identifying reads that do NOT overlap known annotation
> ...
>> filtData <- subsetByOverlaps(aligns, exonRanges)
>> length(filtData)
> [1] 17311
> At this point, the filtData object only contains ranges that did not
> overlap with any of the known exons from Saccharomyces cerevisiae.
> My understanding of subsetByOverlaps is that it would bring back
> exactly the ranges that DO overlap with the known exons?
> 'subsetByOverlaps(query, subject, maxgap = 0L, minoverlap = 1L, type =
>          c("any", "start", "end", "within", "equal"))': Returns the
>          subset of 'query' that has an overlap hit with a range in
>          'subject' using the specified 'findOverlaps' parameters.
>          Both 'query' and 'subject' should be 'Ranges', 'RangesList'
>          or 'RangedData' objects.
> I don't see how this gets the reads mapping in non-exon ranges. Surely
> it gets the reads mapping in the exon ranges, since exonRanges is
> obtained using:
> exonRanges <- exonsBy(txdb, "tx")
> Shouldn't I be looking for the subset that *doesn't* overlap?
> Something like subsetByOverlaps(! aligns, exonRanges)? Or have I
> missed something obvious (quite likely!)?

One thing you can do is call `gaps` on your exonRanges to get the
regions between exons, then keep the reads that hit those "gaps":

R> not.exons <- subsetByOverlaps(aligns, gaps(exonRanges))

This will still return reads that straddle an exon boundary, i.e. reads
that partially overlap both exonic and non-exonic regions.
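
To make the straddling case concrete, here is a minimal sketch on toy data.
It uses plain `IRanges` rather than the vignette's objects, and the
coordinates, the read names and the 1..100 bounds are all made up for
illustration. With real `GRanges` data, `gaps` also takes strand and
sequence lengths into account, so the result would need checking there.

```r
library(IRanges)

# Toy coordinates, invented for illustration (not the vignette's data).
exons <- IRanges(start = c(10, 50), end = c(20, 60))
reads <- IRanges(start = c(12, 18, 30, 58, 70),
                 end   = c(16, 25, 40, 65, 80),
                 names = c("inside", "straddle_left", "intergenic",
                           "straddle_right", "beyond"))

# Everything in 1..100 that is not covered by an exon.
not_exonic <- gaps(exons, start = 1, end = 100)

# Reads with at least one base in a gap. Reads that straddle an exon
# boundary are kept, because part of them lies in a gap.
via_gaps <- subsetByOverlaps(reads, not_exonic)
names(via_gaps)
```

Only the read lying entirely inside an exon is dropped here; both
straddling reads come back along with the purely intergenic ones.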

You can also do `! ... %in% ...`:

R> not.exons <- aligns[!aligns %in% exonRanges]

This should only return reads that don't overlap any of the
`exonRanges` at all.
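
A small sketch of that stricter filter, again on invented `IRanges` toy
data rather than the vignette's objects. One assumption to flag: as far as
I know, later Bioconductor releases changed `%in%` on ranges to mean exact
matching, with `overlapsAny` (or `%over%`) taking over the overlap meaning,
so the sketch uses `overlapsAny` and the `invert = TRUE` argument of
`subsetByOverlaps` instead. Treat it as an illustration, not as what the
vignette does.

```r
library(IRanges)

# Toy coordinates, invented for illustration (not the vignette's data).
exons <- IRanges(start = c(10, 50), end = c(20, 60))
reads <- IRanges(start = c(12, 18, 30, 58, 70),
                 end   = c(16, 25, 40, 65, 80),
                 names = c("inside", "straddle_left", "intergenic",
                           "straddle_right", "beyond"))

# Keep only reads that share no base with any exon.
no_hit <- reads[!overlapsAny(reads, exons)]

# Same filter expressed through subsetByOverlaps.
no_hit_inverted <- subsetByOverlaps(reads, exons, invert = TRUE)

names(no_hit)
```

Unlike the `gaps` route, the reads straddling an exon boundary are
excluded, since they touch an exon by at least one base.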


Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
