[BioC] known gene => gene symbol for UCSC

James W. MacDonald jmacdon at uw.edu
Tue Apr 2 15:49:29 CEST 2013


Hi Ido,

You don't give your sessionInfo() output, but this works for me:

 > select(Mus.musculus, "uc009veu.1", "SYMBOL","TXNAME")
       TXNAME SYMBOL
1 uc009veu.1  Zglp1

 > sessionInfo()
R Under development (unstable) (2013-01-22 r61734)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C                 LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
  [1] Mus.musculus_1.1.0
  [2] TxDb.Mmusculus.UCSC.mm10.knownGene_2.9.0
  [3] org.Mm.eg.db_2.9.0
  [4] GO.db_2.9.0
  [5] RSQLite_0.11.2
  [6] DBI_0.2-5
  [7] OrganismDbi_1.1.14
  [8] GenomicFeatures_1.11.16
  [9] GenomicRanges_1.11.44
[10] IRanges_1.17.42
[11] AnnotationDbi_1.21.16
[12] Biobase_2.19.3
[13] BiocGenerics_0.5.6

loaded via a namespace (and not attached):
  [1] biomaRt_2.15.1      Biostrings_2.27.14  bitops_1.0-5
  [4] BSgenome_1.27.1     graph_1.37.7        RBGL_1.35.0
  [7] RCurl_1.95-4.1      Rsamtools_1.11.27   rtracklayer_1.19.11
[10] stats4_3.0.0        tools_3.0.0         XML_3.96-1.1
[13] zlibbioc_1.5.0
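
If the one-step call above still fails on a different installation, one possible fallback is to do the join by hand in two steps: TXNAME to GENEID through the TxDb package, then GENEID to SYMBOL through org.Mm.eg.db, whose ENTREZID keytype uses the same Entrez Gene IDs. The sketch below is an illustration under stated assumptions, not something run in the session above. The helper name tx_to_symbol is hypothetical, the toy tables reuse only values that appear in this thread, and the real lookup runs only if the mm9 TxDb and org.Mm.eg.db packages happen to be installed. The select() calls pass their arguments by position because the name of the third argument differs between package versions.

```r
# Hypothetical helper (not part of any package): join a transcript->gene
# table to a gene->symbol table on the shared Entrez Gene ID.
tx_to_symbol <- function(tx2gene, gene2sym) {
  out <- merge(tx2gene, gene2sym,
               by.x = "GENEID", by.y = "ENTREZID",
               all.x = TRUE)            # keep transcripts with no symbol
  out[, c("TXNAME", "GENEID", "SYMBOL")]
}

# Toy tables built from values quoted in this thread.
tx2gene  <- data.frame(TXNAME = "uc009veu.1", GENEID = "100009600",
                       stringsAsFactors = FALSE)
gene2sym <- data.frame(ENTREZID = "100009600", SYMBOL = "Zglp1",
                       stringsAsFactors = FALSE)
res <- tx_to_symbol(tx2gene, gene2sym)
print(res)

# Real lookup, attempted only when both annotation packages are installed.
# Assumes the transcript has a gene ID; an NA key would need filtering first.
if (requireNamespace("TxDb.Mmusculus.UCSC.mm9.knownGene", quietly = TRUE) &&
    requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  suppressPackageStartupMessages({
    library(TxDb.Mmusculus.UCSC.mm9.knownGene)
    library(org.Mm.eg.db)
  })
  txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene
  # Positional arguments: x, keys, columns to return, keytype.
  real_tx2gene  <- select(txdb, "uc009veu.1", "GENEID", "TXNAME")
  real_gene2sym <- select(org.Mm.eg.db, unique(real_tx2gene$GENEID),
                          "SYMBOL", "ENTREZID")
  print(tx_to_symbol(real_tx2gene, real_gene2sym))
}
```

The merge keeps every transcript row (all.x = TRUE), so a transcript without a matching symbol shows up with NA in SYMBOL instead of silently disappearing.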



On 4/2/2013 9:43 AM, Ido Tamir wrote:
> Hi,
> How is one supposed to go from a UCSC known gene ID to a gene symbol?
>
>> cols(TxDb.Mmusculus.UCSC.mm9.knownGene)
>   [1] "CDSID"      "CDSNAME"    "CDSCHROM"   "CDSSTRAND"  "CDSSTART"
>   [6] "CDSEND"     "EXONID"     "EXONNAME"   "EXONCHROM"  "EXONSTRAND"
> [11] "EXONSTART"  "EXONEND"    "GENEID"     "TXID"       "EXONRANK"
> [16] "TXNAME"     "TXCHROM"    "TXSTRAND"   "TXSTART"    "TXEND"
>
> I don't see anything that would allow me to link this with, e.g., Mus.musculus
>
>> select(txdb, keys=c(100009600), cols=cols(txdb) ,keytype="GENEID")
>       GENEID  CDSID CDSNAME CDSCHROM CDSSTRAND CDSSTART   CDSEND EXONID EXONNAME
> 1 100009600 112799    <NA>     chr9         - 20871384 20871523 129355     <NA>
> 2 100009600 112798    <NA>     chr9         - 20870468 20870821 129354     <NA>
> 3 100009600 112797    <NA>     chr9         - 20867758 20867840 129353     <NA>
> 4 100009600 112796    <NA>     chr9         - 20867338 20867431 129352     <NA>
> 5 100009600 112795    <NA>     chr9         - 20867032 20867161 129351     <NA>
>    EXONCHROM EXONSTRAND EXONSTART  EXONEND  TXID EXONRANK     TXNAME TXCHROM
> 1      chr9          -  20871384 20872369 28943        1 uc009veu.1    chr9
> 2      chr9          -  20870468 20870821 28943        2 uc009veu.1    chr9
> 3      chr9          -  20867758 20867840 28943        3 uc009veu.1    chr9
> 4      chr9          -  20867338 20867431 28943        4 uc009veu.1    chr9
> 5      chr9          -  20866837 20867161 28943        5 uc009veu.1    chr9
>    TXSTRAND  TXSTART    TXEND
> 1        - 20866837 20872369
> 2        - 20866837 20872369
> 3        - 20866837 20872369
> 4        - 20866837 20872369
> 5        - 20866837 20872369
>
>> cols(Mus.musculus)
>   [1] "GOID"         "TERM"         "ONTOLOGY"     "DEFINITION"   "ENTREZID"
>   [6] "PFAM"         "IPI"          "PROSITE"      "ACCNUM"       "ALIAS"
> [11] "CHR"          "CHRLOC"       "CHRLOCEND"    "ENZYME"       "PATH"
> [16] "PMID"         "REFSEQ"       "SYMBOL"       "UNIGENE"      "ENSEMBL"
> [21] "ENSEMBLPROT"  "ENSEMBLTRANS" "GENENAME"     "UNIPROT"      "GO"
> [26] "EVIDENCE"     "GOALL"        "EVIDENCEALL"  "ONTOLOGYALL"  "MGI"
> [31] "CDSID"        "CDSNAME"      "CDSCHROM"     "CDSSTRAND"    "CDSSTART"
> [36] "CDSEND"       "EXONID"       "EXONNAME"     "EXONCHROM"    "EXONSTRAND"
> [41] "EXONSTART"    "EXONEND"      "GENEID"       "TXID"         "EXONRANK"
> [46] "TXNAME"       "TXCHROM"      "TXSTRAND"     "TXSTART"      "TXEND"
>
>
>> select(Mus.musculus,keys="uc009veu.1", cols=c("SYMBOL"), keytype="TXNAME")
> Error in .testIfKeysAreOfProposedKeytype(x, keys, keytype) :
>    None of the keys entered are valid keys for the keytype specified.
>
> thank you very much,
> ido

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


