[BioC] extracting all GO terms from GOHyperGResult object?

James F. Reid james.reid at ifom-ieo-campus.it
Thu Jan 21 17:20:20 CET 2010


Dear Jenny,

It looks like the pvalue argument to summary() for a hyperGTest result
is applied as a strict inequality, so the 184 terms with p < 1 are
returned while the remaining 172, whose p-value is exactly 1, are
dropped. A quick, brute-force way to overcome this is to call summary
with a pvalue greater than one:
 > dim(summary(hgOver, pvalue=1.1))
[1] 356   7
A more elegant way is to inspect the content of the results:
 > slotNames(hgOver)
[1] "goDag"         "pvalue.order"  "conditional"   "annotation"
[5] "geneIds"       "testName"      "pvalueCutoff"  "testDirection"

The tested GO ids are the nodes of the graph stored in the goDag slot,
so:
 > length(nodes(goDag(hgOver)))
[1] 356
 > nodes(goDag(hgOver))[1:3]
   GO:0000002   GO:0000018   GO:0000279
"GO:0000002" "GO:0000018" "GO:0000279"

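If you also want ExpCount, Count, Size, etc. without going through
summary(), here is a rough sketch, not checked against this GOstats
version. The Category package documents accessors for HyperGResult
objects (pvalues, oddsRatios, expectedCounts, geneCounts,
universeCounts). I am assuming each returns a vector named by GO id.
build_go_table is a made-up helper name, not part of any package.

```r
# Hypothetical helper: combine per-term named vectors (keyed by GO id)
# into one data frame with one row per tested term.
build_go_table <- function(pvalue, oddsRatio, expCount, count, size) {
  ids <- names(pvalue)  # row order follows the p-value vector
  data.frame(
    GOID      = ids,
    Pvalue    = unname(pvalue),
    OddsRatio = unname(oddsRatio[ids]),  # index by id in case orders differ
    ExpCount  = unname(expCount[ids]),
    Count     = unname(count[ids]),
    Size      = unname(size[ids]),
    stringsAsFactors = FALSE
  )
}

# Intended use with hgOver from the example (needs GOstats loaded;
# assumes the accessors return vectors named by GO id):
# all.terms <- build_go_table(pvalues(hgOver), oddsRatios(hgOver),
#                             expectedCounts(hgOver), geneCounts(hgOver),
#                             universeCounts(hgOver))
# dim(all.terms)
```

Indexing every vector by the ids taken from the p-values keeps the rows
aligned even if the accessors do not all return terms in the same order.
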
HTH.
J.

On 21/01/2010 16:33, Jenny Drnevich wrote:
> Hi all,
>
> I've been successfully using the GOstats package for a while now to do
> testing for over-representation of GO terms. I'd also like to use it as
> a quick way to output all the GO terms that get tested. However, I can't
> get the GOHyperGResult object to output all the GO terms that it says it
> tested, it will only output those that are below the pvalueCutoff
> specified. Even when I raise the pvalueCutoff to 1 (max allowed value),
> I still can't get all the terms. Here's a reproducible example:
>
>  > library(ALL)
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>  > library(GOstats)
> Loading required package: Category
> Loading required package: AnnotationDbi
> Loading required package: graph
> Loading required package: DBI
>  > library(hgu95av2.db)
> Loading required package: org.Hs.eg.db
>  >
>  > data(ALL, package = "ALL")
>  >
>  >
>  > sel.IDs <-
> unique(unlist(mget(featureNames(ALL)[1:20],hgu95av2ENTREZID)))
>  > uni.IDs <- unique(unlist(mget(featureNames(ALL),hgu95av2ENTREZID)))
>  >
>  >
>  > params <- new("GOHyperGParams", geneIds=sel.IDs,
> universeGeneIds=uni.IDs,
> + annotation="hgu95av2.db",ontology="BP",pvalueCutoff=1, conditional=T,
> + testDirection="over")
>  >
>  > hgOver <- hyperGTest(params)
>  > hgOver
> Gene to GO BP Conditional test for over-representation
> 356 GO BP ids tested (184 have p < 1)
> Selected gene set size: 18
> Gene universe size: 7685
> Annotation package: hgu95av2
>  >
>  > dim(summary(hgOver))
> [1] 184 7
>
> As you can see, the hgOver object says that it tested 356 GO BP ids, but
> only 184 have p < 1, so the summary(hgOver) only has 184 rows. Is there
> any easy way to get all 356 GO terms out, along with their ExpCount,
> Count, Size, etc.?
>
> Thanks,
> Jenny
>
>  >
>  > sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] GO.db_2.3.5 hgu95av2.db_2.3.5 org.Hs.eg.db_2.3.6
> [4] GOstats_2.12.0 RSQLite_0.8-0 DBI_0.2-5
> [7] graph_1.24.1 Category_2.12.0 AnnotationDbi_1.8.1
> [10] ALL_1.4.7 Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] annotate_1.24.0 genefilter_1.28.2 GSEABase_1.8.0 RBGL_1.22.0
> [5] splines_2.10.1 survival_2.35-7 tools_2.10.1 XML_2.6-0
> [9] xtable_1.5-6
>  >
>
>
>
>
> Jenny Drnevich, Ph.D.
>
> Functional Genomics Bioinformatics Specialist
> W.M. Keck Center for Comparative and Functional Genomics
> Roy J. Carver Biotechnology Center
> University of Illinois, Urbana-Champaign
>
> 330 ERML
> 1201 W. Gregory Dr.
> Urbana, IL 61801
> USA
>
> ph: 217-244-7355
> fax: 217-265-5066
> e-mail: drnevich at illinois.edu
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


