[BioC] FindOverlaps Problem

Vincent Carey stvjc at channing.harvard.edu
Sat Jan 30 22:33:33 CET 2010


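Ah, my result was obtained with the devel version (IRanges 1.5.31), and in
fact it does not agree with yours from IRanges 1.4.0: devel reports subject 6
for query 4, where your output reports subject 5. The devel result seems
correct in this instance.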

> sessionInfo()
R version 2.11.0 Under development (unstable) (2010-01-07 r50940)
i386-apple-darwin9.8.0

locale:
[1] C

attached base packages:
[1] grid      stats     graphics  grDevices datasets  tools     utils
[8] methods   base

other attached packages:
 [1] GenomeGraphs_1.7.1  biomaRt_2.3.0       leeBamSet_0.0.8
 [4] Rsamtools_0.1.24    BSgenome_1.15.4     Biostrings_2.15.18
 [7] IRanges_1.5.31      org.Sc.sgd.db_2.3.5 RSQLite_0.7-3
[10] DBI_0.2-4           AnnotationDbi_1.9.0 Biobase_2.7.0
[13] weaver_1.13.0       codetools_0.2-2     digest_0.4.1

loaded via a namespace (and not attached):
[1] RCurl_1.3-0 XML_2.6-0
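
For anyone puzzling over the index pairs, here is a minimal base-R sketch of
the ordering effect described in the quoted message below. It does not call
IRanges at all. It assumes that rows end up grouped by space name in C-locale
lexicographic order, which matches the printed objects in this thread but is
not taken from the package source. The names a_input, b_input, a_spaces,
b_spaces, shared, hits and hits_named are made up for illustration.

```r
# Space names in the order they were passed to the constructor
# (copied from the example in this thread).
a_input <- paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep = "")
b_input <- paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep = "")

# Assumption: rows are stored grouped by space, with spaces in sorted order.
# method = "radix" sorts character vectors in the C locale, independent of
# the session's collation settings.
a_spaces <- sort(a_input, method = "radix")
b_spaces <- sort(b_input, method = "radix")

# Every range in the example is [1, 10], so a row of a overlaps a row of b
# exactly when the two rows live on the same space.
shared <- intersect(a_spaces, b_spaces)
hits <- cbind(query   = match(shared, a_spaces),
              subject = match(shared, b_spaces))
print(hits)

# Translate row indices back to space names before interpreting the pairs.
hits_named <- data.frame(query_space   = a_spaces[hits[, "query"]],
                         subject_space = b_spaces[hits[, "subject"]])
print(hits_named)
```

Under that assumption the sketch pairs rows (1,1), (2,3), (3,5) and (4,6),
that is chr10, chr2, chr3 and chr5 with themselves, which is the devel output
quoted below. The pair (4,5) in the IRanges 1.4.0 output would match chr5
against chr3.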


On Sat, Jan 30, 2010 at 4:30 PM, Vincent Carey
<stvjc at channing.harvard.edu> wrote:
> It seems to me that it is working correctly, but you can't assume that
> the order of ranges at construction time is the order in the final
> object: the rows are ordered lexicographically by space name. The result
> is challenging to interpret, but if you look at the values of a and b
> before interpreting your findOverlaps result, it starts to make sense.
>
>> a <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep=""))
>> a
> RangedData with 7 rows and 0 value columns across 7 spaces
>        space    ranges |
>  <character> <IRanges> |
> 1       chr10   [1, 10] |
> 2        chr2   [1, 10] |
> 3        chr3   [1, 10] |
> 4        chr5   [1, 10] |
> 5        chr6   [1, 10] |
> 6        chr7   [1, 10] |
> 7        chr9   [1, 10] |
>> b <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep=""))
>> b
> RangedData with 7 rows and 0 value columns across 7 spaces
>        space    ranges |
>  <character> <IRanges> |
> 1       chr10   [1, 10] |
> 2       chr18   [1, 10] |
> 3        chr2   [1, 10] |
> 4       chr21   [1, 10] |
> 5        chr3   [1, 10] |
> 6        chr5   [1, 10] |
> 7        chrX   [1, 10] |
>> findOverlaps(a,b)
> RangesMatchingList of length 7
> names(7): chr10 chr2 chr3 chr5 chr6 chr7 chr9
>> as.matrix(.Last.value)
>     query subject
> [1,]     1       1
> [2,]     2       3
> [3,]     3       5
> [4,]     4       6
>
>
> On Sat, Jan 30, 2010 at 1:09 PM, Wu, Xiwei <XWu at coh.org> wrote:
>> Michael,
>>
>> Here is one example. Please let me know if I have missed anything. Thanks.
>>
>>> a <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 9, 5, 6, 3, 7), sep=""))
>>> b <- RangedData(IRanges(start=rep(1,7), end=rep(10,7)), space=paste("chr", c(2, 10, 18, 5, 21, 3, "X"), sep=""))
>>> as.matrix(findOverlaps(a, a))
>>     query subject
>> [1,]     1       1
>> [2,]     2       2
>> [3,]     3       3
>> [4,]     4       4
>> [5,]     5       5
>> [6,]     6       6
>> [7,]     7       7
>>> as.matrix(findOverlaps(a, b))
>>     query subject
>> [1,]     1       1
>> [2,]     2       3
>> [3,]     3       5
>> [4,]     4       5
>>> a[4]
>> RangedData with 1 row and 0 value columns across 1 space
>>        space    ranges |
>>  <character> <IRanges> |
>> 1        chr5   [1, 10] |
>>> b[5]
>> RangedData with 1 row and 0 value columns across 1 space
>>        space    ranges |
>>  <character> <IRanges> |
>> 1        chr3   [1, 10] |
>>> sessionInfo()
>> R version 2.10.0 (2009-10-26)
>> x86_64-unknown-linux-gnu
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=C              LC_MESSAGES=C
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] rtracklayer_1.6.0                  RCurl_1.3-1
>> [3] bitops_1.0-4.1                     BSgenome.Hsapiens.UCSC.hg18_1.3.15
>> [5] ShortRead_1.4.0                    lattice_0.17-26
>> [7] BSgenome_1.14.0                    Biostrings_2.14.0
>> [9] IRanges_1.4.0
>>
>> loaded via a namespace (and not attached):
>> [1] Biobase_2.6.0 grid_2.10.0   hwriter_1.1   tools_2.10.0  XML_2.6-0
>>
>>
>> Xiwei
>> ________________________________________
>> From: Michael Lawrence [mailto:lawrence.michael at gene.com]
>> Sent: Friday, January 29, 2010 1:13 PM
>> To: Wu, Xiwei
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] FindOverlaps Problem
>>
>> Need more input: what is the expected result, and what actually happened? Please also include the output of sessionInfo().
>> On Fri, Jan 29, 2010 at 11:47 AM, Wu, Xiwei <XWu at coh.org> wrote:
>> Dear all,
>>
>> I found that the findOverlaps function does not work properly if the
>> space levels do not match exactly between subject and query. Has anyone
>> noticed the same problem? I am using R-2.10.0 and IRanges-1.4.0. Has
>> this problem been fixed in the development version?
>>
>> Thanks.
>>
>> Xiwei