[BioC] bug: subsetByOverlaps(GRangesList, GRanges)

Patrick Aboyoun paboyoun at fhcrc.org
Fri Jul 16 18:26:39 CEST 2010


Cei,
The Details section of the man page for subsetByOverlaps explains this 
behavior:

help("subsetByOverlaps,GRangesList,GRanges-method")

In short, subsetByOverlaps operates at the object's "top" level, not at 
the within-element level, just as the "[" operator does for a standard 
R list. So in the GRangesList, GRanges case, each list element is either 
kept exactly as it appears in the original object or dropped entirely. 
In your example both list elements contain a range that overlaps gr3, 
so both are kept whole and the full object comes back.

I am glad to see that you have mentioned using the endoapply function, 
because that is exactly what I would have recommended.

endoapply(grl, subsetByOverlaps, gr3)
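
To make the difference concrete, here is a small sketch. It reuses the 
object names from your example but changes gr2 so that it does not 
overlap gr3 at all, and it assumes a recent GenomicRanges in which 
elementNROWS() is available (older releases called it elementLengths()).

```r
library(GenomicRanges)

gr1 <- GRanges(seqnames = c("a", "b"), ranges = IRanges(c(1, 11), c(5, 15)))
gr2 <- GRanges(seqnames = "b", ranges = IRanges(11, 15))  # altered: no overlap with gr3
gr3 <- GRanges(seqnames = "a", ranges = IRanges(1, 5))
grl <- GRangesList(gr1, gr2)

# Top level: a list element is kept whole if any of its ranges overlaps gr3,
# otherwise it is dropped. Only the first element survives, with both ranges.
top <- subsetByOverlaps(grl, gr3)
length(top)        # 1
elementNROWS(top)  # 2

# Within element: every list element is kept, but each is reduced to the
# ranges that overlap gr3 (possibly none).
within <- endoapply(grl, subsetByOverlaps, gr3)
length(within)                # 2
unname(elementNROWS(within))  # 1 0
```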

If your use case involves a GRangesList with more than a hundred 
elements, however, this may not be fast enough, and I can provide you 
with lower-level code that will be much faster. If this is a common use 
case, we could add a new function that handles the 
IRangesList,IRanges; GRangesList,GRanges; etc. pairings and avoids the 
endoapply framework.
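
One possible shape for such a vectorized approach is sketched here. This 
is only an illustration, not the lower-level code mentioned above, and it 
assumes a recent GenomicRanges that provides overlapsAny() and 
elementNROWS(). The idea is to flatten the list, test every range 
against the subject in a single call, and then regroup the surviving 
ranges by the list element they came from.

```r
library(GenomicRanges)

gr1 <- GRanges(seqnames = c("a", "b"), ranges = IRanges(c(1, 11), c(5, 15)))
gr2 <- GRanges(seqnames = "b", ranges = IRanges(11, 15))
gr3 <- GRanges(seqnames = "a", ranges = IRanges(1, 5))
grl <- GRangesList(gr1, gr2)

# Flatten the list into one GRanges and remember which list element
# each range came from.
flat  <- unlist(grl, use.names = FALSE)
group <- factor(rep(seq_along(grl), elementNROWS(grl)),
                levels = seq_along(grl))

# One overlap test for all ranges instead of one per list element.
hit <- overlapsAny(flat, gr3)

# Regroup the overlapping ranges; unused factor levels are kept by split(),
# so list elements with no hits come back as empty GRanges.
res <- split(flat[hit], group[hit])
unname(elementNROWS(res))  # 1 0
```

The result has one element per element of grl, like the endoapply call, 
but its names are the positions "1", "2", ... rather than the original 
names.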


Patrick



On 7/16/10 4:01 AM, Cei Abreu-Goodger wrote:
> Hello,
>
> I think I've found another bug. If you use subsetByOverlaps with a 
> GRangesList as query, the full object is returned, instead of the 
> subset that overlaps:
>
> library(GenomicRanges)
> gr1 <- GRanges(seqnames=c("a","b"),ranges=IRanges(c(1,11), c(5,15)))
> gr2 <- GRanges(seqnames=c("a","b"),ranges=IRanges(c(1,11), c(5,15)))
> gr3 <- GRanges(seqnames=c("a"),ranges=IRanges(1,5))
> grl <- GRangesList(gr1,gr2)
>
> identical(grl,subsetByOverlaps(grl, gr3))
> [1] TRUE
>
> To get the behavior that I was expecting, you can do:
> endoapply(grl, subsetByOverlaps, gr3)
>
> Cheers,
>
> Cei
>
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> i386-apple-darwin9.8.0
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods   base
>
> other attached packages:
> [1] GenomicRanges_1.0.6 IRanges_1.6.8       Biobase_2.8.0
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: 
> http://news.gmane.org/gmane.science.biology.informatics.conductor


