[BioC] GenomicRanges Use Cases - subsetByOverlaps
mailinglist.honeypot at gmail.com
Tue Nov 8 15:30:54 CET 2011
On Tue, Nov 8, 2011 at 4:27 AM, James Perkins <j.perkins at ucl.ac.uk> wrote:
> I am having some problems following the example in the vignette for
> GenomicRanges, specifically:
> 3.4 Identifying reads that do NOT overlap known annotation
>> filtData <- subsetByOverlaps(aligns, exonRanges)
>  17311
> At this point, the filtData object only contains ranges that did not
> overlap with any of the known exons from Saccharomyces cerevisiae.
> My understanding of subsetByOverlaps is that it would bring back
> exactly the ranges that DO overlap with the known exons?
> 'subsetByOverlaps(query, subject, maxgap = 0L, minoverlap = 1L, type =
> c("any", "start", "end", "within", "equal"))': Returns the
> subset of 'query' that has an overlap hit with a range in
> 'subject' using the specified 'findOverlaps' parameters.
> Both 'query' and 'subject' should be 'Ranges', 'RangesList'
> or 'RangedData' objects.
> I don't see how this gets the reads mapping in non-exon ranges. Surely
> it gets the reads mapping in the exon ranges? since exonRanges is
> obtained using:
> exonRanges <- exonsBy(txdb, "tx")
> Shouldn't I be looking for the subset that *doesn't* overlap?
> Something like subsetByOverlaps(! aligns, exonRanges)? Or have I
> missed something obvious (quite likely!)?
One thing you can do is call `gaps` on your exonRanges to get the
regions between exons, then keep the reads that hit those "gaps":
R> not.exons <- subsetByOverlaps(aligns, gaps(exonRanges))
Note that this will still return reads that straddle an exon boundary,
i.e. reads that partially overlap both exonic and non-exonic regions.
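To see that caveat on a small case, here is a minimal sketch with made-up
toy data. The `exons` and `reads` objects, their coordinates, and the
`chr1` length are all hypothetical, and `exons` is a flat GRanges rather
than the GRangesList that `exonsBy` returns (which may need `unlist()`
first). It assumes a reasonably recent GenomicRanges; since `gaps` on a
GRanges works per strand, only the "*" gaps are kept here.

```r
library(GenomicRanges)

# Hypothetical toy data: two "exons" on a 100 bp chromosome.
exons <- GRanges("chr1",
                 IRanges(start = c(20, 50), end = c(30, 60)),
                 strand = "*",
                 seqlengths = c(chr1 = 100))

# Four hypothetical reads: r1 and r4 are intergenic, r2 straddles the
# end of the first exon, r3 lies entirely inside the second exon.
reads <- GRanges("chr1",
                 IRanges(start = c(5, 25, 52, 70), end = c(10, 35, 58, 80)),
                 strand = "*")
names(reads) <- c("r1", "r2", "r3", "r4")

# gaps() complements the ranges separately for each strand, so keep
# only the gaps computed on the "*" strand that the exons live on.
between <- gaps(exons)
between <- between[strand(between) == "*"]

# Reads hitting any gap; r2 is kept even though it also touches an exon.
not.exons <- subsetByOverlaps(reads, between)
names(not.exons)
```

In this toy case r2 comes back along with r1 and r4, which is the
partial-overlap behaviour described above.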
You can also do `! ... %in% ...`:
R> not.exons <- aligns[!aligns %in% exonRanges]
This will (should) only return reads that don't overlap with any
`exonRanges` at all.
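As far as I am aware, later GenomicRanges releases changed `%in%` on
ranges to mean exact matching and moved the overlap test to
`overlapsAny` / `%over%`, so the line above may behave differently
depending on your version. A minimal sketch of two spellings that do not
rely on `%in%`, using made-up toy data (the `exons` and `reads` objects
and their coordinates are hypothetical, and the `invert` argument assumes
a reasonably recent GenomicRanges):

```r
library(GenomicRanges)

# Hypothetical toy data: two "exons" and four reads on chr1.
exons <- GRanges("chr1",
                 IRanges(start = c(20, 50), end = c(30, 60)),
                 strand = "*")
reads <- GRanges("chr1",
                 IRanges(start = c(5, 25, 52, 70), end = c(10, 35, 58, 80)),
                 strand = "*")
names(reads) <- c("r1", "r2", "r3", "r4")

# Option 1: count exon hits per read and keep the reads with none.
n.hits <- unname(countOverlaps(reads, exons))
not.exons <- reads[n.hits == 0]

# Option 2: ask subsetByOverlaps for the complement of its usual result.
not.exons2 <- subsetByOverlaps(reads, exons, invert = TRUE)

names(not.exons)
names(not.exons2)
```

Both drop r2 (which straddles an exon boundary) and r3 (which sits
inside an exon), leaving only reads with no exon overlap at all.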
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact