[BioC] hyperGTest on KEGG and PFAM with org.XX.eg annotations

Marc Carlson mcarlson at fhcrc.org
Tue Dec 2 00:25:43 CET 2008


Hi James,

The current release and devel versions of Category have been patched. 
Barring any incident, these should now work more reliably and should
also represent things more consistently internally.  The latest versions
of Category to look for are:  2.9.8 for devel and 2.8.2 for release. 
Thank you for reporting the bug.
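
In case it is useful, here is a minimal sketch for checking whether an
installed Category is at least the patched version for your branch. The
helper name is_patched is made up for illustration and is not part of
Category; compareVersion() and packageDescription() are from the utils
package.

```r
# Minimum patched versions of Category quoted in this message.
patched <- c(release = "2.8.2", devel = "2.9.8")

# Hypothetical helper: TRUE if 'installed' is at least the patched
# version for the given branch.
is_patched <- function(installed, branch = c("release", "devel")) {
  branch <- match.arg(branch)
  utils::compareVersion(installed, patched[[branch]]) >= 0
}

# With Category installed, the version string can be obtained with:
#   installed <- utils::packageDescription("Category")$Version
# Here the version from the sessionInfo() in the report is used instead.
is_patched("2.8.1", "release")
```

If this returns FALSE, updating Category from the matching Bioconductor
branch should pick up the fix.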


  Marc



James F. Reid wrote:
> Dear list,
>
> hyperGTest behaves differently with org.XX.eg.db packages than with
> microarray-based ones such as hgu95av2.db when doing a KEGG analysis.
> hyperGTest complains if the annotation string does not end with the
> suffix ".db". It works if you add the suffix, but then you can't run
> summary on the result. A quick fix is to re-assign the ".db"-less
> string to the annotation slot of the hyperGTest result. So I am
> wondering whether I am doing something wrong or whether it is a bug.
>
> For the PFAM analysis everything works fine except that in the summary
> output the Term (Description) is just the PFAMID which is not very
> useful for interpretation. I think this could easily be fixed by using
> the same approach as for the KEGG output in the PFAMHyperGResult
> summary method:
> ## implicit require("PFAM.db")
> pfamEnv <- getAnnMap("DE", "PFAM", load=TRUE)
> pfamTerms <- unlist(mget(pfamIds, pfamEnv, ifnotfound=NA))
>
>
> Many thanks,
> James.
>
> Here is a session reporting the problem:
>
> library("Category")
> library("org.Hs.eg.db")
>
> set.seed(123)
> geneBackground <- Lkeys(org.Hs.egPATH)
> geneList <- sample(geneBackground, 500)
>
> params <- new("KEGGHyperGParams",
>               geneIds = geneList,
>               universeGeneIds = geneBackground,
>               annotation = "org.Hs.eg")
> hgKEGG <- hyperGTest(params)
> #  Error in get(paste(lib, name, sep = "")) :
> #    variable "org.Hs.egPATH2PROBE" was not found
>
> params@annotation <- "org.Hs.eg.db"
> hgKEGG <- hyperGTest(params)
> summary(hgKEGG)
> #  Error in get(paste(annotation(object), "ORGANISM", sep = "")) :
> #    variable "org.Hs.eg.dbORGANISM" was not found
>
> hgKEGG@annotation <- "org.Hs.eg"
> summary(hgKEGG)
> #  KEGGID      Pvalue OddsRatio  ExpCount Count Size
> #1  05130 0.003282103  7.314332 0.6239536     4   51
> #2  05131 0.003282103  7.314332 0.6239536     4   51
> #                                          Term
> #1 Pathogenic Escherichia coli infection - EHEC
> #2 Pathogenic Escherichia coli infection - EPEC
>
>
> > sessionInfo()
> R version 2.8.0 (2008-10-20)
> i486-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
>
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
>  [1] KEGG.db_2.2.5       org.Hs.eg.db_2.2.6  RSQLite_0.7-1
>  [4] DBI_0.2-4           Category_2.8.1      genefilter_1.22.0
>  [7] survival_2.34-1     annotate_1.20.1     xtable_1.5-4
> [10] AnnotationDbi_1.4.1 graph_1.20.0        Biobase_2.2.1
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 GSEABase_1.4.0  RBGL_1.18.0     XML_1.98-1
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


