[BioC] problem with hyperGTest function

Seth Falcon sfalcon at fhcrc.org
Tue Jun 5 16:50:57 CEST 2007


Hi Andrzej,

[I'm sending a copy to the bioconductor list.  Please send further
questions about GOstats to the Bioconductor list so others can
benefit from the answers and chime in with suggestions]

Andrzej Polański <Andrzej.Polanski at polsl.pl> writes:
> I am really enthusiastic about your implementation of hypergeometric
> tests to problems in GO analyses.  I have been working on it for a
> couple of weeks. I was so happy when I found your GOstats package
> within Bioconductor. Since I am not an expert on R (I do my analyses
> using Matlab), I am experiencing some problems with your hyperGtest
> function. Below please find an exemplary code:
>  

First of all, have you read over the vignette that comes with GOstats?
Try this if you have not:

    openVignette(package="GOstats")

    ## then select 1 which should open a PDF viewer for you

> library(hgu133a); 
> affySample <- c("204724_s_at", "214349_at", "205054_at", "209875_s_at", "205700_at");
> geneSample <- as.vector(unlist(mget(affySample, hgu133aACCNUM, ifnotfound=NA))); 

You want to use EntrezGene identifiers to define the selected gene
list and the gene universe.  So you want:

   geneSample <- unlist(mget(affySample, hgu133aENTREZID))  
                       ## but should still filter out NA's, etc

> library(hgu133acdf);
> affyUniverse <- ls(hgu133acdf);

You don't need the hgu133acdf package for this.  If you want to use as
your universe all genes probed on the chip, then you can do:

   chipEntrezUniverse <- as.list(hgu95av2ENTREZID)
   chipEntrezUniverse <- unique(unlist(chipEntrezUniverse))

But you really do not, IMHO, want to use all the genes on the entire
chip as your universe (read through the vignette for why).

You can also save yourself a lot of effort by using the nsFilter
function to do the non-specific filtering of your data.

> geneUniverse <- as.vector(unlist(mget(affyUniverse, hgu133aACCNUM, ifnotfound=NA))); 
> params <- new("GOHyperGParams", geneIds = geneSample, universeGeneIds = geneUniverse, annotation="hgu133a", ontology = "BP", pvalueCutoff = 0.9, conditional = FALSE, testDirection = "over"); 
>
> It works fine till that. Then, when I run:
>  
> hgOver <- hyperGTest(params); 
> summary(hgOver); 
>  
> The following error appears:
>
> Error in getUniverseHelper(probes, datPkg, entrezIds) : 
> No Entrez Gene ids left in universe

Note that the error happens during the call to hyperGTest, not the
call to summary().

> Could you, please, tell me, what is wrong?

You haven't given the right IDs.

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org



More information about the Bioconductor mailing list