[BioC] KEGG REST: retrieving genes

Hooiveld, Guido Guido.Hooiveld at wur.nl
Wed Jan 30 09:59:25 CET 2013


Hi Dan,
Actually, after a slight modification your suggestion does work!
I realized that the pathways named 'mapxxxxx' are the reference pathways. To retrieve the genes for a specific organism, 'map' has to be replaced by that organism's abbreviation, e.g. 'hsa' (human) or 'mmu' (mouse).
Thus, to list all human genes in the Arachidonic Acid Metabolism pathway:

> head(keggGet("hsa00590")[[1]]$GENE)
                                                             8399 
    "PLA2G10; phospholipase A2, group X [KO:K01047] [EC:3.1.1.4]" 
                                                            26279 
  "PLA2G2D; phospholipase A2, group IID [KO:K01047] [EC:3.1.1.4]" 
                                                            30814 
  "PLA2G2E; phospholipase A2, group IIE [KO:K01047] [EC:3.1.1.4]" 
                                                            50487 
   "PLA2G3; phospholipase A2, group III [KO:K01047] [EC:3.1.1.4]" 
                                                            64600 
  "PLA2G2F; phospholipase A2, group IIF [KO:K01047] [EC:3.1.1.4]" 
                                                            81579 
"PLA2G12A; phospholipase A2, group XIIA [KO:K01047] [EC:3.1.1.4]" 
>
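
For anyone scripting this, the ID substitution and the "SYMBOL; description" layout of the GENE entries above can be sketched as follows. Both helper functions are hypothetical (they are not part of KEGGREST), and the parser assumes the named-vector layout printed above, which other KEGGREST versions may not share.

```r
# Hypothetical helper (not part of KEGGREST): swap the 'map' prefix of a
# reference pathway ID for an organism code such as 'hsa' or 'mmu'.
organism_pathway_id <- function(map_id, organism = "hsa") {
  stopifnot(grepl("^map[0-9]{5}$", map_id))
  sub("^map", organism, map_id)
}

# Hypothetical helper: split GENE entries of the form "SYMBOL; description"
# (named by Entrez Gene ID, as in the output above) into a data frame.
parse_gene_entries <- function(gene) {
  data.frame(
    entrez      = names(gene),
    symbol      = unname(sub(";.*$", "", gene)),
    description = unname(sub("^[^;]*;[[:space:]]*", "", gene)),
    stringsAsFactors = FALSE
  )
}

# Two entries copied from the output above, so no network access is needed.
gene <- c(
  "8399"  = "PLA2G10; phospholipase A2, group X [KO:K01047] [EC:3.1.1.4]",
  "26279" = "PLA2G2D; phospholipase A2, group IID [KO:K01047] [EC:3.1.1.4]"
)

pathway_id <- organism_pathway_id("map00590", "hsa")
genes <- parse_gene_entries(gene)
print(pathway_id)
print(genes[, c("entrez", "symbol")])
```

With a network connection, the same parser could presumably be applied to the live result, e.g. parse_gene_entries(keggGet(organism_pathway_id("map00590"))[[1]]$GENE), provided the GENE element has the layout shown above.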

Thanks,
Guido

-----Original Message-----
From: Dan Tenenbaum [mailto:dtenenba at fhcrc.org] 
Sent: Wednesday, January 30, 2013 00:47
To: Hooiveld, Guido
Cc: bioconductor at r-project.org
Subject: Re: [BioC] KEGG REST: retrieving genes

Hi Guido,


On Tue, Jan 29, 2013 at 2:24 PM, Hooiveld, Guido <Guido.Hooiveld at wur.nl> wrote:
> Hi,
> I am exploring the KEGGREST package.
> I would like to retrieve the genes that belong to a specific pathway, e.g. all human genes that are in the Arachidonic Acid Metabolism pathway (= map00590). For now the topology of the pathway is not of relevance to me.
> I have checked the KEGGREST vignette but could not find how to do this, so if it is possible a pointer would be appreciated.
>

Normally the answer would be:

keggGet("map00590")[[1]]$GENE

But it looks like KEGG does not have gene data for this particular pathway. The underlying URL (http://rest.kegg.jp/get/path:map00590) lacks the GENE section that you would see for a different pathway, such as
http://rest.kegg.jp/get/path:hsa05200
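
A script can guard against a missing GENE section before using it. The sketch below is only an illustration: pathway_genes() is a hypothetical helper, not a KEGGREST function, and the two mock lists stand in for what keggGet(...)[[1]] might return, on the assumption that each section of the record becomes a named list element.

```r
# Hypothetical helper: return the GENE element of a parsed KEGG entry,
# or NULL (with a warning) when the entry has no such element.
pathway_genes <- function(entry) {
  if (!"GENE" %in% names(entry)) {
    warning("entry has no GENE section")
    return(NULL)
  }
  entry[["GENE"]]
}

# Mock entries standing in for keggGet(...)[[1]]; real entries carry
# many more fields.
entry_without_genes <- list(ENTRY = "map00590")
entry_with_genes    <- list(ENTRY = "example", GENE = c("gene entry 1", "gene entry 2"))

no_genes   <- suppressWarnings(pathway_genes(entry_without_genes))
some_genes <- pathway_genes(entry_with_genes)
print(is.null(no_genes))
print(length(some_genes))
```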

You can find some (possibly outdated) genes for this pathway by doing the following:
library(org.Hs.eg.db)
select(org.Hs.eg.db, "00590", cols=c("ENTREZID","SYMBOL"), keytype="PATH")

This is old KEGG data and I do not know why their REST interface doesn't contain this data.

> Thanks,
> Guido
>
> As a side note (for the maintainer): I noticed that the API has recently been updated (18 January 2013); among other things, KGML files can now be retrieved and the conversion options from/to KEGG IDs have been expanded.


Thanks! I will update the package.

Dan

>
>> sessionInfo()
> R Under development (unstable) (2012-11-21 r61136)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] KEGGREST_0.99.1
>
> loaded via a namespace (and not attached):
> [1] BiocGenerics_0.5.6 Biostrings_2.27.10 digest_0.6.2       httr_0.2
>  [5] IRanges_1.17.30    parallel_2.16.0    png_0.1-4          RCurl_1.91-1.1
>  [9] stats4_2.16.0      stringr_0.6.2      tools_2.16.0
>>
>


