[BioC] FindOverlaps Problem

Vincent Carey stvjc at channing.harvard.edu
Sat Jan 30 22:30:12 CET 2010


It seems to me that findOverlaps is working correctly, but you can't assume
that the order in which ranges are supplied at construction is the order of
rows in the resulting object.  RangedData orders its spaces lexicographically
by name, so "chr10" sorts before "chr2".  The indices are hard to interpret at
first, but if you print a and b before reading your findOverlaps result, it
starts to make sense: query and subject refer to rows of the sorted objects.
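
As a rough illustration of that ordering, here is a base R sketch that does not
call IRanges at all.  It assumes that the space ordering inside RangedData
matches plain character sorting of the space names, which is what the printed
objects suggest; the variable names are made up for the example.

```r
# Space names in the order they were supplied at construction time
spaces_a <- paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep = "")
spaces_b <- paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep = "")

# Character sorting compares "1" with "2" first, so "chr10" precedes "chr2".
# method = "radix" sorts in the C locale, independent of LC_COLLATE.
sorted_a <- sort(spaces_a, method = "radix")
sorted_b <- sort(spaces_b, method = "radix")

print(sorted_a)  # "chr10" "chr2" "chr3" "chr5" "chr6" "chr7" "chr9"
print(sorted_b)  # "chr10" "chr18" "chr2" "chr21" "chr3" "chr5" "chrX"

# Row that each construction-time range ends up in (assumed, see above)
row_of_a <- match(spaces_a, sorted_a)
print(row_of_a)  # 2 1 7 4 5 3 6
```

So the first range supplied for a (chr2) would sit in row 2, not row 1.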

> a <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep=""))
> a
RangedData with 7 rows and 0 value columns across 7 spaces
        space    ranges |
  <character> <IRanges> |
1       chr10   [1, 10] |
2        chr2   [1, 10] |
3        chr3   [1, 10] |
4        chr5   [1, 10] |
5        chr6   [1, 10] |
6        chr7   [1, 10] |
7        chr9   [1, 10] |
> b <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep=""))
> b
RangedData with 7 rows and 0 value columns across 7 spaces
        space    ranges |
  <character> <IRanges> |
1       chr10   [1, 10] |
2       chr18   [1, 10] |
3        chr2   [1, 10] |
4       chr21   [1, 10] |
5        chr3   [1, 10] |
6        chr5   [1, 10] |
7        chrX   [1, 10] |
> findOverlaps(a,b)
RangesMatchingList of length 7
names(7): chr10 chr2 chr3 chr5 chr6 chr7 chr9
> as.matrix(.Last.value)
     query subject
[1,]     1       1
[2,]     2       3
[3,]     3       5
[4,]     4       6
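
One way to read the matrix above is to translate each index back into a space
name.  The sketch below is base R only and again assumes the rows follow plain
character sorting of the space names; hits is typed in by hand from the output
above rather than computed by findOverlaps.

```r
# Row order assumed for a and b: space names sorted as character strings
sorted_a <- sort(paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep = ""), method = "radix")
sorted_b <- sort(paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep = ""), method = "radix")

# The query/subject pairs shown above, entered by hand
hits <- cbind(query = 1:4, subject = c(1L, 3L, 5L, 6L))

# Translate row indices back into space names
decoded <- data.frame(query_space   = sorted_a[hits[, "query"]],
                      subject_space = sorted_b[hits[, "subject"]],
                      stringsAsFactors = FALSE)
print(decoded)

# All ranges here are [1, 10], so two rows overlap exactly when their
# space names agree; match() gives the pairs one would expect.
m <- match(sorted_a, sorted_b)
expected <- cbind(query = which(!is.na(m)), subject = m[!is.na(m)])
print(expected)
```

Each decoded row pairs a space with itself (chr10, chr2, chr3, chr5), and
expected contains the same four pairs as the matrix above.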


On Sat, Jan 30, 2010 at 1:09 PM, Wu, Xiwei <XWu at coh.org> wrote:
> Michael,
>
> Here is one example. Please let me know if I have missed anything. Thanks.
>
>> a <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep=""))
>> b <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep=""))
>> as.matrix(findOverlaps(a, a))
>     query subject
> [1,]     1       1
> [2,]     2       2
> [3,]     3       3
> [4,]     4       4
> [5,]     5       5
> [6,]     6       6
> [7,]     7       7
>> as.matrix(findOverlaps(a, b))
>     query subject
> [1,]     1       1
> [2,]     2       3
> [3,]     3       5
> [4,]     4       5
>> a[4]
> RangedData with 1 row and 0 value columns across 1 space
>        space    ranges |
>  <character> <IRanges> |
> 1        chr5   [1, 10] |
>> b[5]
> RangedData with 1 row and 0 value columns across 1 space
>        space    ranges |
>  <character> <IRanges> |
> 1        chr3   [1, 10] |
>> sessionInfo()
> R version 2.10.0 (2009-10-26)
> x86_64-unknown-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=C
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] rtracklayer_1.6.0                  RCurl_1.3-1
> [3] bitops_1.0-4.1                     BSgenome.Hsapiens.UCSC.hg18_1.3.15
> [5] ShortRead_1.4.0                    lattice_0.17-26
> [7] BSgenome_1.14.0                    Biostrings_2.14.0
> [9] IRanges_1.4.0
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.0 grid_2.10.0   hwriter_1.1   tools_2.10.0  XML_2.6-0
>
>
> Xiwei
> ________________________________________
> From: Michael Lawrence [mailto:lawrence.michael at gene.com]
> Sent: Friday, January 29, 2010 1:13 PM
> To: Wu, Xiwei
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] FindOverlaps Problem
>
> Need input... what is the expected result? and what actually happened? sessionInfo()...
> On Fri, Jan 29, 2010 at 11:47 AM, Wu, Xiwei <XWu at coh.org> wrote:
> Dear all,
>
> I found that the findOverlaps function does not work properly if the
> space levels do not match exactly between subject and query. Has anyone
> noticed the same problem? I am using the R-2.10.0 and IRanges-1.4.0. Is
> this problem being fixed in the developmental version?
>
> Thanks.
>
> Xiwei
>
>


