[BioC] GRanges, strange behavior when chromosomes don't match

Cei Abreu-Goodger cei at ebi.ac.uk
Fri Jul 9 23:52:16 CEST 2010


Thanks Patrick!

Patrick Aboyoun wrote:
> Cei,
> I checked in a patch to BioC 2.6 (GenomicRanges 1.0.6) and BioC 2.7 
> (GenomicRanges 1.1.16) that fixes the issue you found in the intersect 
> and setdiff methods for GRanges objects when the two objects don't have 
> the same sequence names. These new packages will be available on 
> bioconductor.org within the next 36 hours.
> 
> The intersect and setdiff method for GRanges now form the union of 
> sequence names for both objects before performing their operations. The 
> output for the examples you sent are now
> 
>  > library(GenomicRanges)
>  > gr1 <- GRanges(seqnames = c("chr1", "chr2", "chr3", "chr4"),
> +           ranges = IRanges(1:4, width = 10),
> +           strand = c("-", "+", "+", "+"))
>  > gr2 <- GRanges(seqnames = c("chr1", "chr2", "chr5", "chr6"),
> +           ranges = IRanges(1:4, width = 10),
> +           strand = c("-", "+", "+", "+"))
> 
>  > intersect(gr1,gr2)
> GRanges with 2 ranges and 0 elementMetadata values
>     seqnames    ranges strand |
> <Rle> <IRanges> <Rle> |
> [1]     chr1   [1, 10]      - |
> [2]     chr2   [2, 11]      + |
> 
> seqlengths
>  chr1 chr2 chr3 chr4 chr5 chr6
>    NA   NA   NA   NA   NA   NA
> 
>  > setdiff(gr1,gr2)
> GRanges with 2 ranges and 0 elementMetadata values
>     seqnames    ranges strand |
> <Rle> <IRanges> <Rle> |
> [1]     chr3   [3, 12]      + |
> [2]     chr4   [4, 13]      + |
> 
> seqlengths
>  chr1 chr2 chr3 chr4 chr5 chr6
>    NA   NA   NA   NA   NA   NA
> 
>  > sessionInfo()
> R version 2.12.0 Under development (unstable) (2010-07-01 r52425)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] GenomicRanges_1.1.16 IRanges_1.7.10
> 
> 
> Cheers,
> Patrick
> 
> 
> 
> On 7/8/10 4:55 PM, Patrick Aboyoun wrote:
>> Cei,
>> Thanks for the bug report. I am looking into the issue now and will 
>> have a patch out shortly.
>>
>>
>> Cheers,
>> Patrick
>>
>>
>> On 7/8/10 9:22 AM, Cei Abreu-Goodger wrote:
>>> Hello all,
>>>
>>> When you have two GRanges objects which don't have the same seqnames 
>>> factor, you can get very unexpected behavior.
>>>
>>> Would it be possible to get an error/warning when this is attempted? 
>>> I don't know how many functions are affected, perhaps the functions 
>>> mentioned below can be modified to produce a more obvious result...
>>>
>>> Many thanks,
>>>
>>> Cei
>>>
>>>
>>> Code example / sessionInfo:
>>>
>>> library(GenomicRanges)
>>> gr1 <- GRanges(seqnames = c("chr1", "chr2", "chr3", "chr4"),
>>>           ranges = IRanges(1:4, width = 10),
>>>           strand = c("-", "+", "+", "+"))
>>> gr2 <- GRanges(seqnames = c("chr1", "chr2", "chr5", "chr6"),
>>>           ranges = IRanges(1:4, width = 10),
>>>           strand = c("-", "+", "+", "+"))
>>>
>>> union(gr1,gr2)
>>> # gives expected result, all 6 ranges
>>>
>>> intersect(gr1,gr2)
>>> # includes ranges from the missing gr2 chromosomes
>>>
>>> setdiff(gr1,gr2)
>>> # includes 3 copies of the ranges from the missing gr2 chromosomes
>>>
>>>
>>> sessionInfo()
>>> R version 2.11.0 (2010-04-22)
>>> i386-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices datasets  utils     methods   base
>>>
>>> other attached packages:
>>> [1] GenomicRanges_1.0.5 IRanges_1.6.8       Biobase_2.8.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.11.0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: 
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list