[BioC] GO.db and Gostats question

Tarca, Adi atarca at med.wayne.edu
Thu Sep 3 22:20:35 CEST 2009


Hi all,

I have a question about how GOstats traverses the GO tree and counts how many genes in a given list belong to each GO term.
Here is how I would look for all BP GO terms that have one or more genes in a given list called ALL:     

library(org.Hs.eg.db)
library(GO.db)
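ALL <- c("6462","54784","386678","1572","57830","51764","85443","387644","54807","401106","11012","26529")
# GO ID -> all Entrez IDs annotated to that term or to any of its descendants
xyy <- as.list(org.Hs.egGO2ALLEGS)
# GO ID -> term record (used here only to look up the ontology)
x1 <- as.list(GOTERM)
alGO <- names(xyy)
out <- list()
for (mygo in alGO) {
  # keep only Biological Process terms
  if (Ontology(x1[[mygo]]) == "BP") {
    genesgo <- intersect(as.vector(xyy[[mygo]]), ALL)
    # index by GO ID so that skipped (non-BP) terms leave no empty slots
    out[[mygo]] <- genesgo
  }
}
# number of BP terms by how many genes from ALL they contain
table(unlist(lapply(out, length)))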


I am not concerned here with the speed of the process, but merely with finding the same BP terms that GOstats would find. I see that hyperGTest uses both a universeGeneIds argument and an annotation argument, and I am not sure why the annotation argument is needed given that universeGeneIds is provided.
Note that I cannot get the list of all terms tested by GOstats using something like:

library(GOstats)
params <- new("GOHyperGParams", geneIds = ALL,
 universeGeneIds = ALL, annotation = "hgu133plus2.db",
 ontology = "BP", pvalueCutoff = 1, conditional = FALSE,
 testDirection = "over")
hgCondOver <- hyperGTest(params)

Because the gene list is identical to the universe here, all p-values will be exactly 1, so hyperGTest will return no entries since none are less than the pvalueCutoff.
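
To illustrate why the p-values come out as exactly 1, here is a minimal sketch in base R. It assumes the usual hypergeometric over-representation p-value, P(X >= k), and the counts are made up for illustration (they are not taken from the real annotation):

```r
# Hypothetical counts, for illustration only
N <- 12   # universe size (gene list == universe, as in the call above)
n <- 12   # number of selected genes (same as the universe)
K <- 5    # genes in the universe annotated to some GO term (assumed)
k <- 5    # selected genes annotated to that term; equals K because n == N

# Over-representation p-value: P(X >= k) = P(X > k - 1)
p <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
print(p)
```

When every gene in the universe is selected, all K annotated genes are always drawn, so the tail probability is 1 whatever value K takes.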




Thanks a lot,

Adi 

