[BioC] GenomicRanges::nearest (especially follow)

Harris A. Jaffee hj at jhu.edu
Thu Oct 11 23:27:38 CEST 2012


Apologies in advance if I just don't understand the ignore.strand switch, or
perhaps GRanges objects.  Also, I have not even tried to understand

	GenomicRanges:::.GenomicRanges_findPrecedeFollow

I have the impression that a query with strand entirely "*" essentially implies
ignore.strand, but this case here is handled correctly only if ignore.strand is
TRUE (consistent with distanceToNearest but not distance, which seems correct):

> x = GRanges(ranges=IRanges(start=5, end=5), seqnames="chr1", strand="*")
> Y = GRanges(ranges=IRanges(start=c(6,7), end=c(6,7)), seqnames="chr1", strand=c("+","-"))
> distance(x, Y[1])
[1] 1
> distance(x, Y[2])
[1] 2

> nearest(x, Y, ignore.strand=TRUE)	# correct
[1] 1
> nearest(ranges(x), ranges(Y))       	# also correct            
[1] 1

However,

> nearest(x, Y)
[1] 2

> distanceToNearest(x, Y) 
DataFrame with 1 row and 3 columns
 queryHits subjectHits  distance
 <integer>   <integer> <integer>
1         1           2         2

> distanceToNearest(x, Y, ignore.strand=TRUE) 
DataFrame with 1 row and 3 columns
  queryHits subjectHits  distance
  <integer>   <integer> <integer>
1         1           1         1


Finally,

> follow(ranges(x), ranges(Y))
[1] NA

The issue (along with GenomicRanges:::.nearest) must come down to this:

> follow(x, Y)				# how can this be?
[1] 2

whereas

> follow(x, Y, ignore.strand=TRUE)	# correct, I think
[1] NA

> sessionInfo() 
R Under development (unstable) (2012-10-10 r60908)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C                  
 [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915    
 [5] LC_MONETARY=en_US.iso885915    LC_MESSAGES=en_US.iso885915   
 [7] LC_PAPER=C                     LC_NAME=C                     
 [9] LC_ADDRESS=C                   LC_TELEPHONE=C                
[11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C           

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] GenomicRanges_1.11.0 IRanges_1.17.0       BiocGenerics_0.5.0  

loaded via a namespace (and not attached):
[1] parallel_2.16.0 stats4_2.16.0   tools_2.16.0   



More information about the Bioconductor mailing list