[BioC] Order within a GRanges object

Cook, Malcolm MEC at stowers.org
Tue Aug 20 15:05:35 CEST 2013


>Hello,
 >
 >I have some questions about the internal order of GRanges objects.
 >
 >1) There is automatically an order depending on a) the seqnames (=
 >chromosomes) and b) the ranges.

No! There is no guarantee on the order.

> library(GenomicRanges)
> example(GRanges)
...
> longGR
GRanges with 30 ranges and 1 metadata column:
      seqnames     ranges strand   |     score
         <Rle>  <IRanges>  <Rle>   | <integer>
    a     chr1    [1, 10]      -   |         1
    b     chr2    [2, 10]      +   |         2
    c     chr2    [3, 10]      +   |         3
    d     chr2    [4, 10]      *   |         4
    e     chr1    [5, 10]      *   |         5
  ...      ...        ...    ... ...       ...
          chr2 [106, 115]      -   |        26
          chr2 [107, 116]      -   |        27
          chr3 [108, 117]      -   |        28
          chr3 [109, 118]      -   |        29
          chr3 [110, 119]      -   |        30
  ---
  seqlengths:
   chr1 chr2 chr3
   1000 2000 1500
>  rev(longGR)
GRanges with 30 ranges and 1 metadata column:
      seqnames     ranges strand   |     score
         <Rle>  <IRanges>  <Rle>   | <integer>
          chr3 [110, 119]      -   |        30
          chr3 [109, 118]      -   |        29
          chr3 [108, 117]      -   |        28
          chr2 [107, 116]      -   |        27
          chr2 [106, 115]      -   |        26
  ...      ...        ...    ... ...       ...
    e     chr1    [5, 10]      *   |         5
    d     chr2    [4, 10]      *   |         4
    c     chr2    [3, 10]      +   |         3
    b     chr2    [2, 10]      +   |         2
    a     chr1    [1, 10]      -   |         1
  ---
  seqlengths:
   chr1 chr2 chr3
   1000 2000 1500
>
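
The same point can be checked on a smaller object. This is only a minimal sketch: `gr` is a hypothetical toy object (not part of example(GRanges)), built with its ranges deliberately out of order, and the factor levels are set explicitly so the seqlevels order is known.

```r
library(GenomicRanges)

# Hypothetical toy object: ranges supplied out of genomic order.
# The factor levels fix the seqlevels order as chr1, chr2.
gr <- GRanges(
    seqnames = factor(c("chr2", "chr1", "chr2"), levels = c("chr1", "chr2")),
    ranges   = IRanges(start = c(30, 20, 10), width = 5)
)

# The object keeps the order in which the ranges were supplied;
# nothing is re-sorted behind the scenes.
as.character(seqnames(gr))   # "chr2" "chr1" "chr2"
start(gr)                    # 30 20 10
```

Any ordering you see in a GRanges object is therefore the ordering it was built or subset with.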

 >
 >2) The seqnames are always sorted in ascii order.

No! But they _can_ be sorted:

> sort(longGR)
GRanges with 30 ranges and 1 metadata column:
      seqnames     ranges strand   |     score
         <Rle>  <IRanges>  <Rle>   | <integer>
    f     chr1    [6, 10]      +   |         6
          chr1    [1,  5]      -   |       101
    a     chr1    [1, 10]      -   |         1
          chr1    [2,  6]      -   |       102
          chr1    [3,  7]      -   |       103
  ...      ...        ...    ... ...       ...
    j     chr3 [ 10,  10]      -   |        10
          chr3 [ 10,  14]      -   |       110
          chr3 [108, 117]      -   |        28
          chr3 [109, 118]      -   |        29
          chr3 [110, 119]      -   |        30
  ---
  seqlengths:
   chr1 chr2 chr3
   1000 2000 1500
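
As far as I understand it, sort() on a GRanges follows the order of the seqlevels (not ASCII order as such), then strand, then start. A minimal sketch on a hypothetical toy object `gr` (the object and its values are made up for illustration):

```r
library(GenomicRanges)

# Hypothetical toy object with seqlevels in the order chr1, chr2
gr <- GRanges(
    seqnames = factor(c("chr2", "chr1", "chr2"), levels = c("chr1", "chr2")),
    ranges   = IRanges(start = c(30, 20, 10), width = 5)
)

# sort() returns a new object; gr itself is left unchanged
sorted <- sort(gr)
as.character(seqnames(sorted))   # "chr1" "chr2" "chr2"
start(sorted)                    # 20 10 30

# order() gives the permutation, useful for reordering other vectors alongside
o <- order(gr)                   # 2 3 1

# Reordering the seqlevels changes what sort() considers "first"
seqlevels(gr) <- c("chr2", "chr1")
start(sort(gr))                  # 10 30 20
```

So if the seqlevels happen to be in ASCII order, the sorted seqnames will be too, but that is a property of the seqlevels and not of sort().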


~ Malcolm Cook


 >
 >3) After
 >    df <- as.data.frame
 >    m <- regexpr ("\\d+", df$seqnames, perl=TRUE)
 >    df$Chromosome <- regmatches (df$seqnames, m)
 >    df$Chromosome <- as.integer (as.character (df$Chromosome))
 >    df <- df [order(df$Chromosome),]
 >    only the order of the chromosomes is changed. The order of the ranges
 >(now df$start and df$end) is still the same.
 >
 >Are my assumptions true?
 >
 >Thanks Hermann
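
On point 3, base R's order() leaves tied values in their original relative order, so ordering on the chromosome number alone should keep the ranges within each chromosome in whatever order they had before. A minimal sketch, assuming a hypothetical data frame `df` that stands in for the as.data.frame() result (the values are made up):

```r
# Hypothetical stand-in for a GRanges object converted to a data frame
df <- data.frame(seqnames = c("chr2", "chr1", "chr2", "chr1"),
                 start    = c(30L, 20L, 10L, 5L),
                 stringsAsFactors = FALSE)

# Extract the numeric part of the chromosome name, as in the question
m <- regexpr("\\d+", df$seqnames, perl = TRUE)
df$Chromosome <- as.integer(regmatches(df$seqnames, m))

# Ordering on Chromosome alone: ties keep their original relative order,
# so the ranges within a chromosome are NOT sorted by position
by_chrom <- df[order(df$Chromosome), ]
by_chrom$start   # 20 5 30 10

# Adding start as a second key also sorts the ranges within each chromosome
by_both <- df[order(df$Chromosome, df$start), ]
by_both$start    # 5 20 10 30
```

Note that regmatches() drops elements with no match, so seqnames without digits (chrX, for instance) would make the assignment to df$Chromosome fail with a length mismatch.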