[BioC] retrieve genes names after KEGG hypergeometric test

Marc Carlson mcarlson at fhcrc.org
Wed Nov 3 19:22:11 CET 2010


Hi guys,

I don't want to jump in here and tell you how to write your code, but it
might simplify your life somewhat to know about a couple of
conveniences.  One is the getAnnMap function from the annotate package.
It is handy when you just want to load a mapping.

So instead of doing something like:

require(paste(db, "db", sep="."), character.only = TRUE)

You could do something like this:
library(annotate)
yourMap <- getAnnMap("PATH2PROBE", "hgu95av2.db")

You can basically get whatever mapping you want in a way that will
automatically load the relevant annotation libraries, and append a .db
suffix onto the end of the 'chip' argument (in case you forgot it).  So,
this will also work for the SYMBOL mapping, or any other mapping that
you need.

And then later, when you want to retrieve something from a mapping, it
also pays to know that mget() is vectorized.  That means that if you
pass in a vector for "x", it will return a list with the matching
results for each value in "x".  Therefore, instead of using a
for loop like this:

for (i in 1:length(KEGGID)){
  kegg = as.matrix(unlist(mget(KEGGID[i], get(paste(db, "PATH2PROBE", sep="")), ifnotfound=NA)))
  l[[i]] = genelist[is.element(genelist, kegg[,1])]
}


I think that you should be able to get basically the same kind of result
by doing something more like this:

l = unlist(mget(KEGGID, yourMap, ifnotfound=NA))
l = l[l %in% genelist]
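
One thing to keep in mind is that unlist() flattens everything into a
single vector, so the per-pathway grouping survives only in the element
names (the pathway ID with a running index appended).  If you would
rather keep one entry per KEGG ID, as the original loop did, a minimal
sketch could look like the one below.  To keep it runnable without any
annotation package, it uses a plain environment as a stand-in for the
PATH2PROBE map; toyMap, the pathway IDs and the probe names are all made
up for illustration, and with a real map you would pass yourMap instead.

```r
# Toy stand-in for a PATH2PROBE map: an environment keyed by KEGG pathway ID.
# All IDs and probe names here are invented for illustration.
toyMap <- new.env()
assign("00010", c("probe_a", "probe_b", "probe_c"), envir = toyMap)
assign("04110", c("probe_c", "probe_d"), envir = toyMap)

KEGGID   <- c("00010", "04110", "99999")  # "99999" is deliberately missing
genelist <- c("probe_b", "probe_c", "probe_z")

# One vectorized call: returns a named list with one element per requested ID
# (NA for IDs that are not in the map).
hits <- mget(KEGGID, envir = toyMap, ifnotfound = NA)

# Filter each pathway's probes against genelist, keeping the list structure.
perPathway <- lapply(hits, function(p) p[p %in% genelist])
```

Here perPathway has the same names as KEGGID, and a pathway with no
match simply ends up with a zero-length entry.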


Hope this helps you,


  Marc



On 10/29/2010 05:23 AM, Mike Walter wrote:
> Hi Clémentine,
>
> I don't know whether such a function exists. I use two little helper functions to retrieve the probe IDs or gene symbols of the genes in a gene list that are associated with a KEGG ID:
>
> KEGG2genes = function(KEGGID, genelist, db){
>   # load the annotation package, e.g. hgu133plus2.db
>   require(paste(db, "db", sep="."), character.only = TRUE)
>   l = vector("list", length(KEGGID))
>   for (i in seq_along(KEGGID)){
>     # all probes annotated to this pathway
>     kegg = as.matrix(unlist(mget(KEGGID[i], get(paste(db, "PATH2PROBE", sep="")), ifnotfound=NA)))
>     # keep the probes that are also in the gene list
>     l[[i]] = genelist[is.element(genelist, kegg[,1])]
>   }
>   names(l) = KEGGID
>   l
> }
>
> KEGG2symbol = function(KEGGID, genelist, db){
>   l = vector("list", length(KEGGID))
>   for (i in seq_along(KEGGID)){
>     # probes from the gene list that belong to this pathway
>     id = unlist(KEGG2genes(KEGGID=KEGGID[i], genelist=genelist, db=db))
>     # look up the gene symbol for each of those probes
>     l[[i]] = as.matrix(mget(id, get(paste(db, "SYMBOL", sep="")), ifnotfound=NA))
>   }
>   names(l) = KEGGID
>   l
> }
>
> where "KEGGID" is a character vector of the KEGG ID(s) you are interested in, "genelist" is a character vector containing the probe IDs/probeset IDs of the gene list you used to create the KEGGHyperGResult, and "db" is a character string naming the annotation database for your array without the .db extension (e.g. db="hgu133plus2" for the Affy U133 Plus 2.0 array). As a result you get, for each KEGG ID, a matrix containing the probe IDs and gene symbols, stored in a list. It might not be the most elegant way, but it works.
>
> Kind regards, 
>
> Mike
>
> -----Original Message-----
> From: "Clémentine Dressaire" <clementinedressaire at itqb.unl.pt>
> Sent: 29.10.2010 13:27:44
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] retrieve genes names after KEGG hypergeometric test
>
>
>> Dear BioC users,
>>
>> I performed different hypergeometric tests on my data regarding GO terms
>> and KEGG pathways. With the GO results I can use the probeSetSummary function to
>> retrieve the gene list associated with each significant category.
>> However, this function does not work if I apply the HG test using
>> KEGGHyperGParams because the results are not of the GOHyperGResult class... Is
>> there an equivalent KEGG function to get those gene lists?
>>
>> Thanks in advance for your help.
>>
>> Clémentine
>>
>> -- 
>> Clémentine Dressaire
>> Post-doctoral research fellow
>> Control of gene expression lab
>> ITQB - Instituto de Tecnologia Química e Biológica
>> Apartado 127, Av. da República
>> 2780-157 Oeiras
>> Portugal
>> +351 214469562
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


