[BioC] GOstats - hyperGTest using "KEGGHyperGParams"

Robert Gentleman rgentlem at fhcrc.org
Tue Apr 1 22:12:53 CEST 2008


Hi Paul,
   Thanks for the report.  Please, if you use sample(), also set a seed 
first (with set.seed()); otherwise your example is not reproducible.
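
A minimal sketch of seeding before sampling, in base R (the seed value 123 is arbitrary, chosen only for illustration):

```r
# Fix the random number generator state so that sample() returns the
# same indices every time the script is run.
set.seed(123)
idx1 <- sample(1:1200, 13, replace = FALSE)

# Resetting the same seed reproduces exactly the same draw.
set.seed(123)
idx2 <- sample(1:1200, 13, replace = FALSE)

identical(idx1, idx2)  # TRUE
```

Anyone re-running the example with the same seed (and the same R version) then gets the same gene cluster, so an intermittent error can be reproduced.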

   The short answer is that you cannot use KEGG with the org.XX packages 
in release. Based on your report I have modified the Category package 
(which does most of the work) so that this should now work in the 
devel branch; that change should propagate to the web in the next day 
or so (version 2.5.9).

   best wishes
    Robert


Paul Evans wrote:
> Thanks Robert. I tried the KEGG.db package and tried the 
> KEGGHyperGParams again. The code I used is:
> 
> -----------------------------------------------------------------------------
> 
> ############ TEST hyperGTest for HOMO SAPIENS ######
> library("KEGG.db")
> library("GOstats")
> library("org.Hs.eg.db")
> 
> x <- org.Hs.egACCNUM
> # Get the entrez gene identifiers that are mapped to an ACCNUM
> mapped_genes <- mappedkeys(x)
> geneUniverse <- mapped_genes[1:1200]
> 
> 
> ## Create random cluster of 13 genes
> geneCluster <- sample(1:1200, 13, replace = FALSE)
> geneCluster <- unique(unlist(geneUniverse[geneCluster]))
> 
> print(geneCluster)
> 
> paramsGO <- new("GOHyperGParams", geneIds = geneCluster,
>          universeGeneIds = geneUniverse, annotation = "org.Hs.eg.db",
>          ontology = "BP",
>          pvalueCutoff = 1, conditional = FALSE, testDirection = "over")
> 
> 
> paramsKEGG <- new("KEGGHyperGParams", geneIds = geneCluster,
>          universeGeneIds = geneUniverse, annotation = "org.Hs.eg.db",
>          pvalueCutoff = 1, testDirection = "over")
> 
> 
> tryCatch(hgOverGO <- hyperGTest(paramsGO),
>          error = function(e) {print('error GO')})
> tryCatch(hgOverKEGG <- hyperGTest(paramsKEGG),
>          error = function(e) {print('error KEGG')})
> 
> -----------------------------------------------------------------------------
> 
>  
> 
> The output/error I got now is:
> 
>  
> 
> [1] "901"  "599"  "435"  "100"  "1525" "25"   "204"  "1159" "865"  "1195" "1629" "912"  "998"
> 
> Error in get(paste(lib, name, sep = "")) :
>   no function to return from, jumping to top level
> [1] "error KEGG"
> 
>  
> 
> My sessionInfo() is:
> 
>  
> 
>  > sessionInfo()
> R version 2.6.2 (2008-02-08)
> i386-pc-mingw32
> 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
>  [1] org.Hs.eg.db_2.0.2  GOstats_2.4.0       Category_2.4.0      genefilter_1.16.0   survival_2.34       RBGL_1.14.0         annotate_1.16.1
>  [8] xtable_1.5-2        GO.db_2.0.2         graph_1.16.1        KEGG.db_2.0.2       AnnotationDbi_1.0.6 RSQLite_0.6-8       DBI_0.2-4
> [15] Biobase_1.16.3
> 
> loaded via a namespace (and not attached):
> [1] cluster_1.11.10
>  >
> 
>  
> 
> My apologies if I have missed something elementary!
> 
>  
> 
> thanks!
> 
>  
> 
> 
> 
> ----- Original Message ----
> From: Robert Gentleman <rgentlem at fhcrc.org>
> To: Paul Evans <p.evans48 at yahoo.com>
> Cc: Bioconductor at stat.math.ethz.ch
> Sent: Monday, March 31, 2008 3:45:11 PM
> Subject: Re: [BioC] GOstats - hyperGTest using "KEGGHyperGParams"
> 
> Hi Paul,
>   Thanks for the bug report, it seems that there is an issue when all
> values are zero, which shows up intermittently.  You can solve it by
> using try or tryCatch around the call to hyperGTest.  You can simply use
> a p-value of 1, which is what it will be.
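
The try/tryCatch workaround described above can be sketched in base R as follows. This is only an illustration under stated assumptions: `failing_test` is a hypothetical stand-in for a call to hyperGTest that hits the error, and `safe_pvalue` is a made-up helper name, not part of GOstats or Category.

```r
# Hypothetical stand-in for a hyperGTest() call that fails; it raises
# the same kind of error message as the one in the report.
failing_test <- function() {
  stop("non-numeric argument to binary operator")
}

# Run a test function; if it signals an error, fall back to a p-value of 1.
safe_pvalue <- function(test_fun) {
  tryCatch(
    test_fun(),
    error = function(e) {
      message("test failed: ", conditionMessage(e))
      1
    }
  )
}

p_fail <- safe_pvalue(failing_test)     # error caught, falls back to 1
p_ok   <- safe_pvalue(function() 0.03)  # no error, value returned unchanged
```

In a real analysis the function passed in would wrap the hyperGTest call and whatever extracts the p-value from its result, so that a loop over many random clusters keeps running when one of them triggers the bug.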
> 
> You should not be loading the GO package for this (KEGG if anything, and
> even then, please use KEGG.db, not KEGG).
> 
>   I will fix the bug, but given how close the release is I won't backport
> it; the fix will only be available in the devel branch (soon to be
> the release branch).
> 
>   best wishes
>     Robert
> 
> Paul Evans wrote:
>  > Hi all,
>  >
>  > I was trying to understand the hyperGTest for KEGG, and used the following code:
>  >
>  > 
> -----------------------------------------------------------------------------------------------------------
>  > ## TEST HYPERGTEST FOR KEGG
>  >
>  > library("YEAST")
>  > library("GOstats")
>  > library("GO")
>  >
>  > # Convert to a list
>  > xx <- as.list(YEASTGENENAME)
>  > # Remove probes that do not map to any GENENAME
>  > xx <- xx[!is.na(xx)]
>  > if(length(xx) > 0){
>  >    # Gets the gene names for the first five probe identifiers
>  >    xx[1:5]
>  >    # Get the first one
>  >    xx[[1]]
>  > }
>  >
>  > ## Create gene universe
>  > allGenes <- names(xx)
>  > print(length(allGenes))
>  > geneUniverse <- allGenes[1:800]
>  > for(i in 1:20){
>  >    ## Create random cluster of 13 genes
>  >    geneCluster <- sample(1:800, 13, replace = FALSE)
>  >    geneCluster <- geneUniverse[geneCluster]
>  >    print(i)
>  >    print(geneCluster)
>  >    params <- new("KEGGHyperGParams", geneIds = geneCluster,
>  >          universeGeneIds = geneUniverse, annotation = "YEAST",
>  >            pvalueCutoff = 0.1, testDirection = "over")
>  >    hgOver <- hyperGTest(params)
>  >    dfrm <- summary(hgOver)
>  >    #print(dfrm)
>  > }
>  >
>  > 
> --------------------------------------------------------------------------------------------------------
>  >
>  > The output/error that I got is:
>  >
>  > [1] 1
>  >  [1] "YKR067W" "MOF9"    "YDR518W" "YPR074C" "YCL011C" "YCR069W" "YDL104C" "YGR136W" "YAR003W" "YFR013W" "YOR116C" "YDR507C" "YGR167W"
>  > [1] 2
>  >  [1] "YJR112W" "CEN8"    "YPL005W" "YHR081W" "YLR323C" "YBR131W" "YLR347C" "YHR098C" "YOR107W" "YCL027W" "YNR012W" "CRL16"  "YLR329W"
>  > [1] 3
>  >  [1] "YNL327W" "YEL056W" "YNL321W" "YDL111C" "YMR284W" "YLR338W" "YPL008W" "CRL17"  "YEL065W" "YFR027W" "YMR269W" "YPL019C" "YML038C"
>  > Error in numW - numWdrawn : non-numeric argument to binary operator
>  >
>  >
>  >
>  > My sessionInfo():
>  >
>  > > sessionInfo()
>  > R version 2.6.2 (2008-02-08)
>  > i386-pc-mingw32
>  > locale:
>  > LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>  > attached base packages:
>  > [1] splines  tools    stats    graphics  grDevices utils    datasets  methods  base
>  > other attached packages:
>  >  [1] KEGG_2.0.1          GOstats_2.4.0      Category_2.4.0      genefilter_1.16.0  survival_2.34      RBGL_1.14.0        GO.db_2.0.2
>  >  [8] graph_1.16.1        goTools_1.10.0      annotate_1.16.1    xtable_1.5-2        AnnotationDbi_1.0.6 RSQLite_0.6-8      DBI_0.2-4
>  > [15] Biobase_1.16.3      GO_2.0.1            hu6800_2.0.1        hgu95a_2.0.1        hgu95av2_2.0.1      hgu133plus2_2.0.1  hgu133b_2.0.1
>  > [22] hgu133a_2.0.1      som_0.3-4          YEAST_2.0.1        cluster_1.11.10
>  >
>  >
>  > thanks!
>  >
>  >
> 
> -- 
> Robert Gentleman, PhD
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M2-B876
> PO Box 19024
> Seattle, Washington 98109-1024
> 206-667-7700
> rgentlem at fhcrc.org
> 
> 



