[BioC] hyperGTest on KEGG and PFAM with org.XX.eg annotations

James F. Reid james.reid at ifom-ieo-campus.it
Fri Nov 21 11:34:19 CET 2008

Dear list,

hyperGTest behaves differently with org.XX.eg.db packages than with 
microarray-based ones (hgu95av2.db, for example) when doing a KEGG 
analysis. hyperGTest complains if the annotation string does not end 
with the suffix ".db". It works if you add the suffix, but then you 
cannot run summary on the result. A quick fix is to re-assign the 
".db"-less string to the annotation slot of the hyperGTest result.
So I am wondering whether I am doing something wrong or whether it is a bug.

For the PFAM analysis everything works fine, except that in the summary 
output the Term (Description) is just the PFAMID, which is not very 
useful for interpretation. I think this could easily be fixed by using 
the same approach as for the KEGG output in the PFAMHyperGResult summary:
## implicit require("PFAM.db")
pfamEnv <- getAnnMap("DE", "PFAM", load=TRUE)
pfamTerms <- unlist(mget(pfamIds, pfamEnv, ifnotfound=NA))
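
To illustrate what that lookup would return, here is a minimal base-R sketch. It assumes the annotation map behaves like an ordinary environment keyed by PFAM ID; the environment contents and the IDs below are placeholders rather than real PFAM.db entries, and falling back to the bare ID when no description is found is only a suggestion.

```r
# Stand-in for the PFAM ID -> description map; the IDs and descriptions
# below are placeholders, not real PFAM.db content.
pfamEnv <- new.env()
assign("PF_A", "placeholder description A", envir = pfamEnv)
assign("PF_B", "placeholder description B", envir = pfamEnv)

pfamIds <- c("PF_A", "PF_B", "PF_MISSING")

# Same lookup as in the snippet above: IDs without an entry come back as NA.
pfamTerms <- unlist(mget(pfamIds, envir = pfamEnv, ifnotfound = NA))

# Assumed fallback: keep the bare ID where no description is available.
missing <- is.na(pfamTerms)
pfamTerms[missing] <- pfamIds[missing]
pfamTerms
```

With the real map, pfamEnv would instead come from getAnnMap as shown above.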

Many thanks,

Here is a session reporting the problem:


library(Category)      # provides hyperGTest and KEGGHyperGParams
library(org.Hs.eg.db)
geneBackground <- Lkeys(org.Hs.egPATH)
geneList <- sample(geneBackground, 500)

params <- new("KEGGHyperGParams",
               geneIds = geneList,
               universeGeneIds = geneBackground,
               annotation = "org.Hs.eg")
hgKEGG <- hyperGTest(params)
#  Error in get(paste(lib, name, sep = "")) :
#    variable "org.Hs.egPATH2PROBE" was not found

params@annotation <- "org.Hs.eg.db"
hgKEGG <- hyperGTest(params)
summary(hgKEGG)
#  Error in get(paste(annotation(object), "ORGANISM", sep = "")) :
#    variable "org.Hs.eg.dbORGANISM" was not found

hgKEGG@annotation <- "org.Hs.eg"
summary(hgKEGG)
#  KEGGID      Pvalue OddsRatio  ExpCount Count Size
#1  05130 0.003282103  7.314332 0.6239536     4   51
#2  05131 0.003282103  7.314332 0.6239536     4   51
#                                          Term
#1 Pathogenic Escherichia coli infection - EHEC
#2 Pathogenic Escherichia coli infection - EPEC
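
Judging only from the two error messages above, the lookups seem to build object names by pasting the annotation string directly onto a map name. The base-R sketch below mimics that construction to show why the ".db" suffix ends up inside the name. build_map_name and strip_db are hypothetical helpers written for illustration; they are not functions from Category or AnnotationDbi.

```r
# Sketch only: mimics the name construction visible in the error messages.
# build_map_name is a hypothetical helper, not part of any package.
build_map_name <- function(annotation, map) {
  paste(annotation, map, sep = "")
}

# Hypothetical helper: drop a trailing ".db" so the prefix matches the
# prefix used by the objects inside the annotation package.
strip_db <- function(annotation) {
  sub("\\.db$", "", annotation)
}

# With the suffix, the name from the second error message is produced.
build_map_name("org.Hs.eg.db", "ORGANISM")

# Without the suffix, the name has the usual "org.Hs.eg" prefix.
build_map_name(strip_db("org.Hs.eg.db"), "ORGANISM")

# The first error message corresponds to this name.
build_map_name("org.Hs.eg", "PATH2PROBE")
```

This matches the quick fix in the session: resetting the annotation slot to the ".db"-less string before calling summary.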

 > sessionInfo()
R version 2.8.0 (2008-10-20)


attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
  [1] KEGG.db_2.2.5       org.Hs.eg.db_2.2.6  RSQLite_0.7-1
  [4] DBI_0.2-4           Category_2.8.1      genefilter_1.22.0
  [7] survival_2.34-1     annotate_1.20.1     xtable_1.5-4
[10] AnnotationDbi_1.4.1 graph_1.20.0        Biobase_2.2.1

loaded via a namespace (and not attached):
[1] cluster_1.11.11 GSEABase_1.4.0  RBGL_1.18.0     XML_1.98-1
