[BioC] shift function (GenomicRanges, IRanges), issue with negative start values and circular DNA

Ivanek, Robert robert.ivanek at fmi.ch
Thu Sep 1 17:23:11 CEST 2011


Hi Michael,

is there any plan to fixed these issues?

Best Regards

Robert

On 09/01/2011 03:37 PM, Michael Lawrence wrote:
> Nevermind, I misunderstood what you were trying to do.
>
> On Thu, Sep 1, 2011 at 6:36 AM, Michael Lawrence <michafla at gene.com
> <mailto:michafla at gene.com>> wrote:
>
>     As a sort of side-comment, you might be better off doing this with
>     resize(), i.e.,
>
>     resize(gr, width(gr) - 50, fix = "end")
>
>     Michael
>
>     On Thu, Sep 1, 2011 at 5:27 AM, Ivanek, Robert <robert.ivanek at fmi.ch
>     <mailto:robert.ivanek at fmi.ch>> wrote:
>
>         Dear Bioc developers,
>
>         I am analysing the ChIP-Seq data using the GenomicRanges
>         package. I am usually shifting the reads to the middle of the
>         fragment (i.e. by 50 bp toward 3' end of the fragment according
>         the strand of the read), using the shift function (from
>         IRanges), which works in most cases.
>
>         shift(gr, shift=50 * as.integer(ifelse(strand(gr)==__"-", -1, 1)))
>
>         However sometimes the result does not look like as I would expect:
>
>         1. If the read is on the minus strand and the start value is
>         smaller than the shift value, it produces an error:
>
>         ## Error in validObject(x) :
>         ## invalid class “IRanges” object: 'widths(x)' cannot contain
>         negative values
>
>         If the read is on the plus strand at the end of the chromosome
>         it returns the size of the chromosome as a new end value. Would
>         be possible to apply similar strategy for the beginning of the
>         chromosome (It would return 1 as a start value)?
>
>
>
>         2. The shift function does not take into account information
>         about circularity of the chromosome. Is there any plan to
>         include it?
>
>
>         Please see below reproducible examples
>
>         Best Regards
>
>         Robert
>
>         --
>         Robert Ivanek
>         Postdoctoral Fellow Schuebeler Group
>         Friedrich Miescher Institute
>         Maulbeerstrasse 66
>         4058 Basel / Switzerland
>         Office phone: +41 61 697 6100 <tel:%2B41%2061%20697%206100>
>
>
>
>
>         EXAMPLES:
>
>
>         ##
>         ##############################__##############################__#################
>         ##
>         library("GenomicRanges")
>         ## Loading required package: IRanges
>         ## Attaching package: ‘IRanges’
>         ## The following object(s) are masked from ‘package:base’:
>         ## cbind, eval, intersect, Map, mapply, order, paste, pmax,
>         pmax.int <http://pmax.int>, pmin, pmin.int <http://pmin.int>,
>         rbind, rep.int <http://rep.int>, setdiff, table, union
>
>         ##
>         ##
>         ##############################__##############################__#################
>         ##
>         chr1.gr <http://chr1.gr> <- GRanges(seqnames=Rle("chr1", 4),
>         ranges=IRanges(start=rep(c(1,__197195432),each=2), width=1),
>         strand=rep(c("+","-"),2),
>         seqlengths=c(chr1=197195432))
>         ##
>
>         isCircular(chr1.gr <http://chr1.gr>) <- FALSE
>         ##
>
>         chr1.gr <http://chr1.gr>
>         ## GRanges with 4 ranges and 0 elementMetadata values:
>         ## seqnames ranges strand
>         ## <Rle> <IRanges> <Rle>
>         ## [1] chr1 [ 1, 1] +
>         ## [2] chr1 [ 1, 1] -
>         ## [3] chr1 [197195432, 197195432] +
>         ## [4] chr1 [197195432, 197195432] -
>         ## ---
>         ## seqlengths:
>         ## chr1
>         ## 197195432
>         ##
>
>         seqinfo(chr1.gr <http://chr1.gr>)
>         ## Seqinfo of length 1
>         ## seqnames seqlengths isCircular
>         ## chr1 197195432 FALSE
>         ##
>
>         shift(chr1.gr <http://chr1.gr>, shift=50)
>         ## GRanges with 4 ranges and 0 elementMetadata values:
>         ## seqnames ranges strand
>         ## <Rle> <IRanges> <Rle>
>         ## [1] chr1 [ 51, 51] +
>         ## [2] chr1 [ 51, 51] -
>         ## [3] chr1 [197195432, 197195432] +
>         ## [4] chr1 [197195432, 197195432] -
>         ## ---
>         ## seqlengths:
>         ## chr1
>         ## 197195432
>         ## Warning message:
>         ## In `end<-`(`*tmp*`, check = FALSE, value = c(51L, 51L,
>         197195482L, :
>         ## trimmed end values to be <= seqlengths
>         ##
>
>         shift(chr1.gr <http://chr1.gr>, shift=50 *
>         as.integer(ifelse(strand(chr1.__gr <http://chr1.gr>)=="-", -1, 1)))
>         ## Error in validObject(x) :
>         ## invalid class “IRanges” object: 'widths(x)' cannot contain
>         negative values
>         ## In addition: Warning messages:
>         ## 1: In `end<-`(`*tmp*`, check = FALSE, value = c(51L, -49L,
>         197195482L, :
>         ## trimmed end values to be <= seqlengths
>         ## 2: In `start<-`(`*tmp*`, value = c(51L, -49L, 197195432L,
>         197195382L :
>         ## trimmed start values to be positive
>         ##
>         ##
>         ##############################__##############################__#################
>         ##
>         chrM.gr <- GRanges(seqnames=Rle("chrM", 4),
>         ranges=IRanges(start=rep(c(1,__16299),each=2), width=1),
>         strand=rep(c("+","-"),2),
>         seqlengths=c(chrM=16299))
>         ##
>
>         isCircular(chrM.gr) <- TRUE
>         ##
>
>         chrM.gr
>         ## GRanges with 4 ranges and 0 elementMetadata values:
>         ## seqnames ranges strand
>         ## <Rle> <IRanges> <Rle>
>         ## [1] chrM [ 1, 1] +
>         ## [2] chrM [ 1, 1] -
>         ## [3] chrM [16299, 16299] +
>         ## [4] chrM [16299, 16299] -
>         ## ---
>         ## seqlengths:
>         ## chrM
>         ## 16299
>         ##
>
>         seqinfo(chrM.gr)
>         ## Seqinfo of length 1
>         ## seqnames seqlengths isCircular
>         ## chrM 16299 TRUE
>         ##
>
>         shift(chrM.gr, shift=50)
>         ## GRanges with 4 ranges and 0 elementMetadata values:
>         ## seqnames ranges strand
>         ## <Rle> <IRanges> <Rle>
>         ## [1] chrM [ 51, 51] +
>         ## [2] chrM [ 51, 51] -
>         ## [3] chrM [16299, 16299] +
>         ## [4] chrM [16299, 16299] -
>         ## ---
>         ## seqlengths:
>         ## chrM
>         ## 16299
>         ## Warning message:
>         ## In `end<-`(`*tmp*`, check = FALSE, value = c(51L, 51L,
>         16349L, 16349L :
>         ## trimmed end values to be <= seqlengths
>         ##
>
>         shift(chrM.gr, shift=50 *
>         as.integer(ifelse(strand(chrM.__gr)=="-", -1, 1)))
>         ## Error in validObject(x) :
>         ## invalid class “IRanges” object: 'widths(x)' cannot contain
>         negative values
>         ## In addition: Warning messages:
>         ## 1: In `end<-`(`*tmp*`, check = FALSE, value = c(51L, -49L,
>         16349L, :
>         ## trimmed end values to be <= seqlengths
>         ## 2: In `start<-`(`*tmp*`, value = c(51L, -49L, 16299L, 16249L)) :
>         ## trimmed start values to be positive
>         ##
>         ##
>         ##
>         ##############################__##############################__#################
>         ##
>         sessionInfo()
>         ## R Under development (unstable) (2011-08-29 r56824)
>         ## Platform: x86_64-unknown-linux-gnu (64-bit)
>
>         ## locale:
>         ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
>         LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
>         ## [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C LC_ADDRESS=C
>         LC_TELEPHONE=C
>         ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
>         ## attached base packages:
>         ## [1] stats graphics grDevices utils datasets methods base
>
>         ## other attached packages:
>         ## [1] GenomicRanges_1.5.31 IRanges_1.11.26
>         ##
>         ##
>         ##############################__##############################__#################
>
>
>
>         _______________________________________________
>         Bioconductor mailing list
>         Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>

-- 
Robert Ivanek
Postdoctoral Fellow Schuebeler Group
Friedrich Miescher Institute
Maulbeerstrasse 66
4058 Basel / Switzerland
Office phone: +41 61 697 6100



More information about the Bioconductor mailing list