[BioC] retrieve genes names after KEGG hypergeometric test

Mike Walter michael_walter at email.de
Tue Nov 2 10:37:53 CET 2010


Hi Clémentine,

The "db" just adds a ".db" suffix so that the annotation package can be loaded. Just try paste(yourDb, "db", sep="."); this gives "yourDb.db". Maybe a short example will help. I take a list of 8 probesets from the Affymetrix Rat 230 2.0 array and check which of these transcripts are involved in the KEGG pathways 04710 (circadian rhythm) and 00240 (pyrimidine metabolism); the session is shown below.
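
To make the two kinds of names concrete, here is a minimal sketch in base R. It assumes nothing beyond base R: toySYMBOL and toy_db are made-up stand-ins (not a real annotation package), used only so that the snippet runs without rat2302.db installed.

```r
db <- "rat2302"

# Package name handed to require(): built with a "." separator
pkg_name <- paste(db, "db", sep = ".")            # "rat2302.db"

# Names of the mapping objects inside that package: no separator
symbol_map <- paste(db, "SYMBOL", sep = "")       # "rat2302SYMBOL"
path_map   <- paste(db, "PATH2PROBE", sep = "")   # "rat2302PATH2PROBE"

# get() turns such a string into the object that carries that name.
# Hypothetical stand-in for an annotation map, for illustration only:
toySYMBOL <- new.env()
assign("1368303_at", "Per2", envir = toySYMBOL)

toy_db  <- "toy"
toy_map <- get(paste(toy_db, "SYMBOL", sep = ""))  # the toySYMBOL environment
toy_symbol <- mget("1368303_at", envir = toy_map, ifnotfound = NA)
```

So db itself is only the prefix; the literal "db" and "SYMBOL" strings are what paste() glues onto it.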

Regards, Mike

> genelist=c("1368303_at", "1378745_at", "1392640_at", "1369996_at", 
+                   "1370295_at", "1378180_at", "1388182_at", "1398875_at")
> KEGGID=c("04710", "00240")
> db="rat2302" #which will load rat2302.db annotation package
> 
> myKEGG = KEGG2symbol(KEGGID, genelist, db)
> myKEGG
$`04710`
                         [,1]  
1368303_at "Per2"
1378745_at "Per3"
1392640_at "Cry1"

$`00240`
                          [,1]    
1369996_at "Polr2f"
1370295_at "Nme1"  
1378180_at "Dctd"  
1388182_at "Prim1" 
1398875_at "Polr3k"
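
The result above comes down to a membership test followed by a lookup. The following is only a sketch of that logic on made-up toy data: path2probe and probe2symbol are hypothetical stand-ins for the PATH2PROBE and SYMBOL maps of an annotation package, and the probe "9999999_at" with symbol "Dummy" is invented for illustration.

```r
# Hypothetical toy maps standing in for <db>PATH2PROBE and <db>SYMBOL
path2probe <- list(
  "04710" = c("1368303_at", "1378745_at", "9999999_at"),
  "00240" = c("1369996_at", "1370295_at")
)
probe2symbol <- c(
  "1368303_at" = "Per2",   "1378745_at" = "Per3",
  "1369996_at" = "Polr2f", "1370295_at" = "Nme1",
  "9999999_at" = "Dummy"
)

genelist <- c("1368303_at", "1378745_at", "1369996_at")

# For each pathway: keep the genelist probes annotated to it,
# then look up their symbols (a named character vector per pathway)
kegg_symbols <- lapply(path2probe, function(probes) {
  hits <- genelist[genelist %in% probes]
  probe2symbol[hits]
})
```

Probes that belong to a pathway but are not in genelist (here the invented "9999999_at") are dropped, which is why only a subset of each pathway shows up in the output.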


-----Original Message-----
From: "Clémentine Dressaire" <clementinedressaire at itqb.unl.pt>
Sent: 29.10.2010 15:21:33
To: "Mike Walter" <michael_walter at email.de>
Subject: Re: [BioC] retrieve genes names after KEGG hypergeometric test

>Hi Mike,
>
>Could you explain to me the difference between the db and the "db" you are using?
>If db is the character vector with the annotation database for your array
>without the .db extension, then what does "db" represent?
>
>Again, thanks for your help,
>
>Clémentine
>
>On Fri, 29 Oct 2010 14:23:00 +0200 (CEST), "Mike Walter"
><michael_walter at email.de> wrote:
>> Hi Clémentine,
>>
>> I don't know if such a function exists. I use two little helper functions
>> to retrieve probe IDs or gene symbols of genes in a genelist that are
>> associated with a KEGG ID:
>>
>> KEGG2genes = function(KEGGID, genelist, db){
>>   # load the annotation package, e.g. "rat2302.db"
>>   require(paste(db, "db", sep="."), character.only = TRUE)
>>   l = vector("list")
>>   for (i in seq_along(KEGGID)){
>>     # all probe IDs annotated to this KEGG pathway
>>     kegg = as.matrix(unlist(mget(KEGGID[i], get(paste(db, "PATH2PROBE", sep="")), ifnotfound=NA)))
>>     # keep only the probes that are also in the genelist
>>     l[[i]] = genelist[is.element(genelist, kegg[,1])]
>>   }
>>   names(l) = KEGGID
>>   l
>> }
>>
>> KEGG2symbol = function(KEGGID, genelist, db){
>>   l = vector("list")
>>   for (i in seq_along(KEGGID)){
>>     # probes from the genelist that belong to this pathway
>>     id = unlist(KEGG2genes(KEGGID=KEGGID[i], genelist=genelist, db=db))
>>     # look up the gene symbol of each probe
>>     l[[i]] = as.matrix(mget(id, get(paste(db, "SYMBOL", sep="")), ifnotfound=NA))
>>   }
>>   names(l) = KEGGID
>>   l
>> }
>>
>> where "KEGGID" is a character vector of the KEGG ID(s) you are interested
>> in, "genelist" is a character vector containing the probe IDs/probeset IDs
>> of the genelist you used to create the KEGGHyperGResult, and "db" is a
>> character vector with the annotation database for your array without the
>> .db extension (e.g. db="hgu133plus2" for the affy U133+ 2.0 array). As a
>> result you get a matrix containing the probe IDs and gene symbols for each
>> KEGG ID, stored in a list. It might not be the most elegant way, but it
>> works.
>>
>> Kind regards,
>>
>> Mike
>>
>> -----Original Message-----
>> From: "Clémentine Dressaire" <clementinedressaire at itqb.unl.pt>
>> Sent: 29.10.2010 13:27:44
>> To: bioconductor at stat.math.ethz.ch
>> Subject: [BioC] retrieve genes names after KEGG hypergeometric test
>>
>>>Dear BioC users,
>>>
>>>I performed different hypergeometric tests on my data regarding GO terms
>>>and KEGG pathways. With the GO results I can use the probeSetSummary function
>>>to retrieve the gene list associated with each significant category.
>>>However, this function does not work if I apply the HG test using
>>>KEGGHyperGParams, because the results are not of the GOHyperGResult class... Is
>>>there an equivalent KEGG function to get those gene lists?
>>>
>>>With thanks in advance for your help.
>>>
>>>Clémentine
>>>
>>>--
>>>Clémentine Dressaire
>>>Post-doctoral research fellow
>>>Control of gene expression lab
>>>ITQB - Instituto de Tecnologia Química e Biológica
>>>Apartado 127, Av. da República
>>>2780-157 Oeiras
>>>Portugal
>>>+351 214469562
>>>
>>>_______________________________________________
>>>Bioconductor mailing list
>>>Bioconductor at stat.math.ethz.ch
>>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>Search the archives:
>>>http://news.gmane.org/gmane.science.biology.informatics.conductor


