[BioC] Odd behaviour with renameSeqlevels

Alex Gutteridge alexg at ruggedtextile.com
Wed May 2 13:43:41 CEST 2012


Is this a bug in renameSeqlevels or expected behaviour? Note the odd
ordering of the seqlevels in txbygene (chrX sits between chr7 and chr8),
which results in misnaming when I use renameSeqlevels: everything after
chr7 is off by one (e.g. gene 5327 moves from chr8 to 9 below). The docs
for renameSeqlevels aren't explicit about whether the renaming vector has
to match the order of the original names, but I thought the point of
making it a named vector was that it doesn't?
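
To make the expectation concrete, here is a minimal base-R sketch of
name-based renaming on toy data. It is only an illustration of the
behaviour I assumed, not the actual renameSeqlevels implementation; the
variable names (old_levels, renaming, new_levels) are made up for this
example.

```r
# Toy seqlevels in a non-sorted order, similar to txbygene (chrX before chr8)
old_levels <- c("chr1", "chr7", "chrX", "chr8", "chrY")

# Named renaming vector: names are the old levels, values are the new ones.
# Its order deliberately differs from the order of old_levels.
renaming <- c("chr1" = "1", "chr7" = "7", "chr8" = "8", "chrX" = "X")

# Look up each old level by name; levels absent from the vector stay unchanged
idx <- match(old_levels, names(renaming))
new_levels <- unname(ifelse(is.na(idx), old_levels, renaming[idx]))

print(new_levels)
```

With matching by name, chr8 maps to "8" regardless of where it sits in
the seqlevels, and chrY (not in the vector) is left alone.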

> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
Loading required package: GenomicFeatures
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following object(s) are masked from ‘package:stats’:

     xtabs

The following object(s) are masked from ‘package:base’:

     anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
     get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
     pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
     rownames, sapply, setdiff, table, tapply, union, unique

Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

     Vignettes contain introductory material; view with
     'browseVignettes()'. To cite Bioconductor, see
     'citation("Biobase")', and for packages 'citation("pkgname")'.

> txdb = TxDb.Hsapiens.UCSC.hg19.knownGene
> txbygene = transcriptsBy(txdb,"gene")
> tx = renameSeqlevels(txbygene,c("chr1"="1","chr2"="2","chr3"="3","chr4"="4",
+                                 "chr5"="5","chr6"="6","chr7"="7","chr8"="8",
+                                 "chr9"="9","chr10"="10","chr11"="11","chr12"="12",
+                                 "chr13"="13","chr14"="14","chr15"="15","chr16"="16",
+                                 "chr17"="17","chr18"="18","chr19"="19","chr20"="20",
+                                 "chr21"="21","chr22"="22","chrX"="X"))
> seqlevels(txbygene)
  [1] "chr1"                  "chr2"                  "chr3"
  [4] "chr4"                  "chr5"                  "chr6"
  [7] "chr7"                  "chrX"                  "chr8"
[10] "chr9"                  "chr10"                 "chr11"
[13] "chr12"                 "chr13"                 "chr14"
[16] "chr15"                 "chr16"                 "chr17"
[19] "chr18"                 "chr20"                 "chrY"
[22] "chr19"                 "chr22"                 "chr21"
[25] "chr6_ssto_hap7"        "chr6_mcf_hap5"         "chr6_cox_hap2"
[28] "chr6_mann_hap4"        "chr6_apd_hap1"         "chr6_qbl_hap6"
[31] "chr6_dbb_hap3"         "chr17_ctg5_hap1"       "chr4_ctg9_hap1"
[34] "chr1_gl000192_random"  "chrUn_gl000225"        "chr4_gl000194_random"
[37] "chr4_gl000193_random"  "chr9_gl000200_random"  "chrUn_gl000222"
[40] "chrUn_gl000212"        "chr7_gl000195_random"  "chrUn_gl000223"
[43] "chrUn_gl000224"        "chrUn_gl000219"        "chr17_gl000205_random"
[46] "chrUn_gl000215"        "chrUn_gl000216"        "chrUn_gl000217"
[49] "chr9_gl000199_random"  "chrUn_gl000211"        "chrUn_gl000213"
[52] "chrUn_gl000220"        "chrUn_gl000218"        "chr19_gl000209_random"
[55] "chrUn_gl000221"        "chrUn_gl000214"        "chrUn_gl000228"
[58] "chrUn_gl000227"        "chr1_gl000191_random"  "chr19_gl000208_random"
[61] "chr9_gl000198_random"  "chr17_gl000204_random" "chrUn_gl000233"
[64] "chrUn_gl000237"        "chrUn_gl000230"        "chrUn_gl000242"
[67] "chrUn_gl000243"        "chrUn_gl000241"        "chrUn_gl000236"
[70] "chrUn_gl000240"        "chr17_gl000206_random" "chrUn_gl000232"
[73] "chrUn_gl000234"        "chr11_gl000202_random" "chrUn_gl000238"
[76] "chrUn_gl000244"        "chrUn_gl000248"        "chr8_gl000196_random"
[79] "chrUn_gl000249"        "chrUn_gl000246"        "chr17_gl000203_random"
[82] "chr8_gl000197_random"  "chrUn_gl000245"        "chrUn_gl000247"
[85] "chr9_gl000201_random"  "chrUn_gl000235"        "chrUn_gl000239"
[88] "chr21_gl000210_random" "chrUn_gl000231"        "chrUn_gl000229"
[91] "chrM"                  "chrUn_gl000226"        "chr18_gl000207_random"
> seqlevels(tx)
  [1] "1"                     "2"                     "3"
  [4] "4"                     "5"                     "6"
  [7] "7"                     "8"                     "9"
[10] "10"                    "11"                    "12"
[13] "13"                    "14"                    "15"
[16] "16"                    "17"                    "18"
[19] "19"                    "20"                    "chrY"
[22] "21"                    "22"                    "X"
[25] "chr6_ssto_hap7"        "chr6_mcf_hap5"         "chr6_cox_hap2"
[28] "chr6_mann_hap4"        "chr6_apd_hap1"         "chr6_qbl_hap6"
[31] "chr6_dbb_hap3"         "chr17_ctg5_hap1"       "chr4_ctg9_hap1"
[34] "chr1_gl000192_random"  "chrUn_gl000225"        "chr4_gl000194_random"
[37] "chr4_gl000193_random"  "chr9_gl000200_random"  "chrUn_gl000222"
[40] "chrUn_gl000212"        "chr7_gl000195_random"  "chrUn_gl000223"
[43] "chrUn_gl000224"        "chrUn_gl000219"        "chr17_gl000205_random"
[46] "chrUn_gl000215"        "chrUn_gl000216"        "chrUn_gl000217"
[49] "chr9_gl000199_random"  "chrUn_gl000211"        "chrUn_gl000213"
[52] "chrUn_gl000220"        "chrUn_gl000218"        "chr19_gl000209_random"
[55] "chrUn_gl000221"        "chrUn_gl000214"        "chrUn_gl000228"
[58] "chrUn_gl000227"        "chr1_gl000191_random"  "chr19_gl000208_random"
[61] "chr9_gl000198_random"  "chr17_gl000204_random" "chrUn_gl000233"
[64] "chrUn_gl000237"        "chrUn_gl000230"        "chrUn_gl000242"
[67] "chrUn_gl000243"        "chrUn_gl000241"        "chrUn_gl000236"
[70] "chrUn_gl000240"        "chr17_gl000206_random" "chrUn_gl000232"
[73] "chrUn_gl000234"        "chr11_gl000202_random" "chrUn_gl000238"
[76] "chrUn_gl000244"        "chrUn_gl000248"        "chr8_gl000196_random"
[79] "chrUn_gl000249"        "chrUn_gl000246"        "chr17_gl000203_random"
[82] "chr8_gl000197_random"  "chrUn_gl000245"        "chrUn_gl000247"
[85] "chr9_gl000201_random"  "chrUn_gl000235"        "chrUn_gl000239"
[88] "chr21_gl000210_random" "chrUn_gl000231"        "chrUn_gl000229"
[91] "chrM"                  "chrUn_gl000226"        "chr18_gl000207_random"
> txbygene$'5327'
GRanges with 6 ranges and 2 elementMetadata cols:
       seqnames               ranges strand |     tx_id     tx_name
          <Rle>            <IRanges>  <Rle> | <integer> <character>
   [1]     chr8 [42032236, 42050729]      - |     31953  uc010lxf.1
   [2]     chr8 [42032236, 42050729]      - |     31954  uc010lxg.1
   [3]     chr8 [42032236, 42065194]      - |     31955  uc003xos.2
   [4]     chr8 [42032236, 42065194]      - |     31956  uc003xot.2
   [5]     chr8 [42032236, 42065194]      - |     31957  uc011lcm.1
   [6]     chr8 [42032236, 42065194]      - |     31958  uc011lcn.1
   ---
   seqlengths:
                     chr1                  chr2 ... chr18_gl000207_random
                249250621             243199373 ...                  4262
> tx$'5327'
GRanges with 6 ranges and 2 elementMetadata cols:
       seqnames               ranges strand |     tx_id     tx_name
          <Rle>            <IRanges>  <Rle> | <integer> <character>
   [1]        9 [42032236, 42050729]      - |     31953  uc010lxf.1
   [2]        9 [42032236, 42050729]      - |     31954  uc010lxg.1
   [3]        9 [42032236, 42065194]      - |     31955  uc003xos.2
   [4]        9 [42032236, 42065194]      - |     31956  uc003xot.2
   [5]        9 [42032236, 42065194]      - |     31957  uc011lcm.1
   [6]        9 [42032236, 42065194]      - |     31958  uc011lcn.1
   ---
   seqlengths:
                        1                     2 ... chr18_gl000207_random
                249250621             243199373 ...                  4262
> txbygene$'1956'
GRanges with 11 ranges and 2 elementMetadata cols:
        seqnames               ranges strand |     tx_id     tx_name
           <Rle>            <IRanges>  <Rle> | <integer> <character>
    [1]     chr7 [55086725, 55224644]      + |     28336  uc003tqh.3
    [2]     chr7 [55086725, 55236328]      + |     28337  uc003tqi.3
    [3]     chr7 [55086725, 55238738]      + |     28338  uc003tqj.3
    [4]     chr7 [55086725, 55270769]      + |     28339  uc022adm.1
    [5]     chr7 [55086725, 55270769]      + |     28340  uc010kzg.2
    [6]     chr7 [55086725, 55275031]      + |     28341  uc003tqk.3
    [7]     chr7 [55086725, 55275031]      + |     28342  uc022adn.1
    [8]     chr7 [55177540, 55275031]      + |     28343  uc011kco.2
    [9]     chr7 [55224226, 55238906]      + |     28345  uc011kcq.1
   [10]     chr7 [55224226, 55238906]      + |     28346  uc011kcp.1
   [11]     chr7 [55248979, 55259567]      + |     28349  uc022ado.1
   ---
   seqlengths:
                     chr1                  chr2 ... chr18_gl000207_random
                249250621             243199373 ...                  4262
> tx$'1956'
GRanges with 11 ranges and 2 elementMetadata cols:
        seqnames               ranges strand |     tx_id     tx_name
           <Rle>            <IRanges>  <Rle> | <integer> <character>
    [1]        7 [55086725, 55224644]      + |     28336  uc003tqh.3
    [2]        7 [55086725, 55236328]      + |     28337  uc003tqi.3
    [3]        7 [55086725, 55238738]      + |     28338  uc003tqj.3
    [4]        7 [55086725, 55270769]      + |     28339  uc022adm.1
    [5]        7 [55086725, 55270769]      + |     28340  uc010kzg.2
    [6]        7 [55086725, 55275031]      + |     28341  uc003tqk.3
    [7]        7 [55086725, 55275031]      + |     28342  uc022adn.1
    [8]        7 [55177540, 55275031]      + |     28343  uc011kco.2
    [9]        7 [55224226, 55238906]      + |     28345  uc011kcq.1
   [10]        7 [55224226, 55238906]      + |     28346  uc011kcp.1
   [11]        7 [55248979, 55259567]      + |     28349  uc022ado.1
   ---
   seqlengths:
                        1                     2 ... chr18_gl000207_random
                249250621             243199373 ...                  4262
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.7.1
[2] GenomicFeatures_1.8.1
[3] AnnotationDbi_1.18.0
[4] Biobase_2.16.0
[5] GenomicRanges_1.8.3
[6] IRanges_1.14.2
[7] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
  [1] biomaRt_2.12.0     Biostrings_2.24.1  bitops_1.0-4.1     BSgenome_1.24.0
  [5] DBI_0.2-5          RCurl_1.91-1       Rsamtools_1.8.3    RSQLite_0.11.1
  [9] rtracklayer_1.16.1 stats4_2.15.0      tools_2.15.0       XML_3.9-4
[13] zlibbioc_1.2.0

-- 
Alex Gutteridge


