[BioC] GenomicRanges request - enable ignore.strand for findOverlaps comparing query with itself?

Janet Young jayoung at fhcrc.org
Tue Jun 10 22:57:02 CEST 2014

Hi there,

I have a request for findOverlaps (GenomicRanges) - hopefully it's an easy one. 

Is it possible to implement the ignore.strand option for findOverlaps calls that compare a query GRanges with itself? The reason I ask is that I'm looking through a set of genes to find pairs that overlap on opposite strands. Below is some code that should explain it (I'm using the devel packages).

thanks very much,


##### GRanges


library(GenomicRanges)

## an example GRanges object, taken from the findOverlaps-methods {GenomicRanges} help page:
gr <-
  GRanges(seqnames =
          Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
          ranges =
          IRanges(1:10, width = 10:1, names = head(letters,10)),
          strand =
          Rle(strand(c("-", "+", "*", "+", "-")),
              c(1, 2, 2, 3, 2)),
          score = 1:10,
          GC = seq(1, 0, length=10))

## findOverlaps works, of course, and finds 24 hits:
findOverlaps(gr)

## ignoreSelf and ignoreRedundant are useful: together they give me just 7 pairs to explore further:
findOverlaps(gr, ignoreSelf=TRUE, ignoreRedundant=TRUE)

## but I'm not getting hits for overlaps on opposite strands - I'd like to use ignore.strand, but it only works if we supply both query and subject:
findOverlaps(gr, ignore.strand=TRUE)
# Error in .local(query, subject, maxgap, minoverlap, type, select, ...) : 
#   unused argument (ignore.strand = TRUE)

## when I supply gr as both query and subject, I get 34 pairs ignoring the strand:
findOverlaps(gr, gr, ignore.strand=TRUE)

## but now that I'm supplying the subject, I can't use the other two useful options (ignoreSelf and ignoreRedundant) that help me quickly get the pairs I'd like to explore more 
findOverlaps(gr, gr, ignore.strand=TRUE, ignoreSelf=TRUE, ignoreRedundant=TRUE)
# Error in .local(query, subject, maxgap, minoverlap, type, select, ...) : 
#  unused arguments (ignoreSelf = TRUE, ignoreRedundant = TRUE)
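
Until something like that is supported, one possible workaround is to run the two-argument call with ignore.strand=TRUE and then filter the hits by hand. The following is only a minimal sketch under stated assumptions: the toy object `genes` and the variable names `hits`, `q_strand`, `s_strand` and `opposite` are made up for illustration, and keeping pairs where the query index is smaller than the subject index is my stand-in for ignoreSelf plus ignoreRedundant, not necessarily how the package implements them.

```r
library(GenomicRanges)

## hypothetical toy data: three ranges on chr1, "a" and "b" overlap on opposite strands
genes <- GRanges(seqnames = "chr1",
                 ranges = IRanges(start = c(1, 5, 20), end = c(10, 15, 30),
                                  names = c("a", "b", "c")),
                 strand = c("+", "-", "+"))

## strand-agnostic overlaps of the object against itself
hits <- findOverlaps(genes, genes, ignore.strand = TRUE)

## drop self hits (i == i) and keep each unordered pair only once (i < j)
hits <- hits[queryHits(hits) < subjectHits(hits)]

## keep only pairs where one range is on "+" and the other on "-"
q_strand <- as.character(strand(genes))[queryHits(hits)]
s_strand <- as.character(strand(genes))[subjectHits(hits)]
opposite <- hits[(q_strand == "+" & s_strand == "-") |
                 (q_strand == "-" & s_strand == "+")]
opposite
```

The same filtering should apply to the larger `gr` example above by substituting `gr` for `genes`. Ranges on the "*" strand are excluded by the last step; whether they should count as "opposite" is a choice left open here.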


R version 3.1.0 Patched (2014-05-26 r65771)
Platform: x86_64-unknown-linux-gnu (64-bit)

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GenomicRanges_1.17.17 GenomeInfoDb_1.1.6    IRanges_1.99.15      
[4] S4Vectors_0.0.8       BiocGenerics_0.11.2  

loaded via a namespace (and not attached):
[1] stats4_3.1.0  XVector_0.5.6

