[BioC] Hypergeometric test with Disease Ontology

Gilbert Feng g-feng at northwestern.edu
Thu Jan 27 16:17:38 CET 2011


Hi, Ted

DO.db is a standard sqlite file and you can use standard RSQLite 
procedure to retrieve the information. Actually we do have a lite 
version of Disease Ontology, DOLite that removes some redundant nodes, 
integrated in GeneAnswers package, which also uses hypergeometric test 
to run enrichment analysis as well as automatically generated 
interactive(cytoscape web support) html summary for one or more groups 
of genes.

Best

Gilbert

On 1/27/11 5:00 AM, Ted Morrow wrote:
> Dear all,
>
> I would like to conduct a hypergeometric test on a list of genes but
> using Disease Ontology instead of GO or KEGG terms. The package "DO.db"
> contains this information but I have been unable to find a way of using
> this database in conjunction with the "GOstats" package that I have been
> using.
>
> Has anyone attempted to run a hypergeometric test for Disease Ontology
> terms? Is there another package I could use? Or is there a way of
> modifying the argument (if that's the right word) GOHyperGParams in
> GOstats so that it can make use of the information in DO.db?
>
> Both the GO and KEGG analyses work fine:
> ### GO chunk
> params <- new("GOHyperGParams",
> geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
> annotation = "hgu95av2.db", ontology="BP",
> pvalueCutoff=0.01, conditional=FALSE,
> testDirection="over")
> hgOver <- hyperGTest(params)
>
> hgOver
> Gene to GO BP test for over-representation
> 2136 GO BP ids tested (15 have p < 0.01)
> Selected gene set size: 112
> Gene universe size: 951
> Annotation package: hgu95av2
>
> ### KEGG chunk
> paramsKEGG <- new("KEGGHyperGParams",
> geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
> annotation = "hgu95av2.db",
> pvalueCutoff=0.01,
> testDirection="over")
>
> hgOverKEGG <- hyperGTest(paramsKEGG)
> hgOverKEGG
> summary(hgOverKEGG)
>
> My data looks like this:
> str(selectedEntrezIds)
> chr [1:157] "60528" "6853" "10157" "5081" "389434" "6591" "7414" "7546"
> "3074" "6916" "6559" "23503" "8626" "6557" "38" "60" "9733" "113235"
> "28962" "10269" "4069" "30835" "7018" ...
>
>  > str(entrezUniverse)
> chr [1:1310] "8813" "3075" "2729" "8379" "204" "170302" "10165" "6521"
> "799" "3052" "1387" "5244" "3674" "6833" "10083" "60528" "8842" "5048"
> "4843" "6329" "5080" "6401" "6853" ...
>
> My naive attempts to use DO have included:
> paramsDO <- new("DOHyperGParams",
> geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
> annotation = "DO.db",
> pvalueCutoff=0.01,
> testDirection="over")
>
> Which of course doesn't work and gives the following error:
> Error in getClass(Class, where = topenv(parent.frame())) :
> "DOHyperGParams" is not a defined class
>
>  > traceback()
> 3: stop(gettextf("\"%s\" is not a defined class", Class), domain = NA)
> 2: getClass(Class, where = topenv(parent.frame()))
> 1: new("DOHyperGParams", geneIds = selectedEntrezIds, universeGeneIds =
> entrezUniverse,
> annotation = "DO.db", pvalueCutoff = 0.01, testDirection = "over")
>
>
> Replacing "GOHyperGParams" with "DOHyperGParams" also gives the
> following error:..
>
> hgOverDO <- hyperGTest(paramsDO)
> Error in match.arg(ontology, c("BP", "CC", "MF")) :
> 'arg' should be one of “BP”, “CC”, “MF”
>
> traceback()
> 10: stop(gettextf("'arg' should be one of %s", paste(dQuote(choices),
> collapse = ", ")), domain = NA)
> 9: match.arg(ontology, c("BP", "CC", "MF"))
> 8: getUniverseViaGo(p)
> 7: universeBuilder(p)
> 6: universeBuilder(p)
> 5: .hyperGTestInternal(p)
> 4: is(object, Cl)
> 3: is(object, Cl)
> 2: .valueClassTest(standardGeneric("hyperGTest"), "HyperGResultBase",
> "hyperGTest")
> 1: hyperGTest(paramsDO)
>
>
> Any help would be greatly appreciated.
> /Ted
>
>  > sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
> States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] KEGG.db_2.4.5 DO.db_2.1.0 GO.db_2.4.5 hgu95av2.db_2.4.5
> org.Hs.eg.db_2.4.6 GOstats_2.16.0 RSQLite_0.9-4 DBI_0.2-5 graph_1.28.0
> Category_2.16.0 AnnotationDbi_1.12.0
> [12] Biobase_2.10.0
>
> loaded via a namespace (and not attached):
> [1] annotate_1.28.0 genefilter_1.32.0 GSEABase_1.12.2 RBGL_1.26.0
> splines_2.12.1 survival_2.36-2 tools_2.12.1 XML_3.2-0.2 xtable_1.5-6
>
>



More information about the Bioconductor mailing list