[BioC] Gene symbol to KEGG gene ids
mcarlson at fhcrc.org
Fri Apr 17 17:54:15 CEST 2009
To get the KEGG IDs associated with a particular gene symbol, you
must first use the org.Hs.eg.db package to convert the gene symbols into
Entrez Gene IDs. There are two reasons for this. The first is that we
never want to use gene symbols as primary identifiers, because they are
not unique. The second is that the org.Hs.eg.db package is
Entrez Gene centric, so if you have an Entrez Gene ID, you can get
to every other piece of information in the org.Hs.eg.db package.
To achieve this we can do the following:
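##Load the annotation package
library(org.Hs.eg.db)
##Toy example symbols:
sym = c("AKT3","CDH1")
##Get the Entrez gene IDs associated with those symbols
EG_IDs = mget(sym, revmap(org.Hs.egSYMBOL),ifnotfound=NA)
##Flatten to a character vector (a symbol can map to several IDs), drop NAs
EG_IDs = unlist(EG_IDs)
EG_IDs = EG_IDs[!is.na(EG_IDs)]
##Then get the KEGG pathway IDs associated with those Entrez genes.
KEGG_IDs = mget(EG_IDs, org.Hs.egPATH,ifnotfound=NA)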
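The two-step lookup can also be tried without the annotation package
installed. The following is only a minimal sketch in base R: sym2eg and
eg2path are hypothetical stand-ins for revmap(org.Hs.egSYMBOL) and
org.Hs.egPATH, lookup() is a made-up helper that imitates
mget(..., ifnotfound=NA) on a plain list, and all symbols and IDs are
invented placeholders rather than real annotation. It illustrates why the
intermediate result is best flattened with unlist() when a symbol maps to
more than one Entrez Gene ID or to none.

```r
## Hypothetical stand-ins for the annotation maps (all IDs are made up)
sym2eg  <- list(GENEA = "101", GENEB = c("202", "203"))
eg2path <- list("101" = c("00010", "04010"), "202" = "04510", "203" = NA)

## Made-up helper imitating mget(..., ifnotfound = NA) on a plain list
lookup <- function(keys, map) {
  out <- lapply(keys, function(k) if (k %in% names(map)) map[[k]] else NA)
  names(out) <- keys
  out
}

## GENEC is deliberately missing from sym2eg
sym <- c("GENEA", "GENEB", "GENEC")

## Step 1: symbols -> Entrez Gene IDs (a list with one element per symbol)
eg_ids <- lookup(sym, sym2eg)

## Flatten to one ID per element and drop symbols that were not found
eg_vec <- unlist(eg_ids, use.names = FALSE)
eg_vec <- eg_vec[!is.na(eg_vec)]

## Step 2: Entrez Gene IDs -> pathway IDs
path_ids <- lookup(eg_vec, eg2path)
```

Calling as.character() on eg_ids instead of unlist() would turn the
two-ID element into a single deparsed string, which would not match any
key in the second lookup.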
Please let me know if you have more questions.
Daniel Brewer wrote:
> The org.Hs.eg.db package provides annotations that link a gene to a
> particular KEGG pathway. What I would like to know is what
> the KEGG ids associated with this gene symbol are. This information does
> not seem to be available in either KEGG.db or org.Hs.eg.db, but must be
> used to construct the annotation files. Does anyone know how to get
> this info?