[BioC] serious problem with GOstats package

James W. MacDonald jmacdon at uw.edu
Thu May 9 16:25:41 CEST 2013


Hi Greg,

It doesn't really matter what you think should happen, nor what the 
theory might be. What matters is how the code works and what the 
requirements are. In this case the code requires that you pass in an 
annotation package name, which is then used to map the Entrez Gene IDs 
to GO terms.

You are expecting the GOstats package to take your Entrez Gene IDs and 
then somehow look them up and figure out what the species might be, 
against all evidence to the contrary. I suppose this could 
hypothetically be done, but that isn't how the code works. So if you 
look at the vignette, you will see this:

params <- new("GOHyperGParams",
              geneIds=selectedEntrezIds,
              universeGeneIds=entrezUniverse,
              annotation="hgu95av2.db",
              ontology="BP",
              pvalueCutoff=hgCutoff,
              conditional=FALSE,
              testDirection="over")

Please note that both the universe AND the annotation package are 
included. Now if we delve into the code, we can see that one of the 
first steps in the hyperGTest function is to build the universe:

 > GOstats:::.hyperGTestInternal
function (p)
{
     p <- makeValidParams(p)
     p@universeGeneIds <- universeBuilder(p)

and if we delve a bit deeper, we find this in the getUniverseViaGo function:

 > getAnywhere(getUniverseViaGo)
A single object matching ‘getUniverseViaGo’ was found
It was found in the following places
   namespace:Category
with value

function (p)
{
     datPkg <- p@datPkg
     ontology <- ontology(p)
     entrezIds <- universeGeneIds(p)
     ontology <- match.arg(ontology, c("BP", "CC", "MF"))
     ontIds <- aqListGOIDs(ontology)
     probe2go <- eapply(ID2GO(datPkg), function(goids) {

Now that ^^^^^^^^ line looks pretty close to your error, no?

So what exactly is p@datPkg?

 > showClass("GOHyperGParams")
Class "GOHyperGParams" [package "Category"]

Slots:

Name:           ontology       conditional           geneIds   universeGeneIds
Class:         character           logical               ANY               ANY

Name:         annotation            datPkg categorySubsetIds      categoryName
Class:         character            DatPkg               ANY         character

Name:       pvalueCutoff     testDirection
Class:           numeric         character

Extends: "HyperGParams"

It's the annotation package name that you neglected to include when you 
built your GOHyperGParams object. And just to check, to make super sure 
I am right,

 > p <- new("GOHyperGParams", geneIds = "1", universeGeneIds = c("1","2","3"), annotation = "org.Hs.eg.db")
 > p@datPkg
An object of class "Org.XX.egDatPkg"
Slot "name":
[1] "org.Hs.eg"
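
If it helps to see why the annotation name matters for that slot, here is a toy sketch in plain R. The classes ToyDatPkg and ToyParams are hypothetical names I made up for illustration; this is not the Category package's actual code, only the general S4 pattern of deriving one slot from another when the object is built.

```r
library(methods)

## Toy classes; the names are hypothetical, NOT the Category package's code
setClass("ToyDatPkg", representation(name = "character"))
setClass("ToyParams",
         representation(annotation = "character", datPkg = "ToyDatPkg"))

setMethod("initialize", "ToyParams", function(.Object, ...) {
    .Object <- callNextMethod(.Object, ...)
    ## derive the datPkg slot from the annotation name, dropping ".db"
    if (length(.Object@annotation) == 1L)
        .Object@datPkg <- new("ToyDatPkg",
                              name = sub("\\.db$", "", .Object@annotation))
    .Object
})

withAnno <- new("ToyParams", annotation = "org.Hs.eg.db")
noAnno   <- new("ToyParams")

withAnno@datPkg@name        # "org.Hs.eg"
length(noAnno@datPkg@name)  # 0: no package name to look anything up in
```

Without an annotation, the toy object still gets built, but its datPkg slot is empty, so any later lookup through it has nothing to work with.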

So please try again, this time including the annotation package as I 
suggested before.
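
Something along these lines should do it. This is only a sketch: the three Entrez Gene IDs and the universe (every human Entrez Gene ID with a GO annotation) are stand-ins I made up, so substitute your own sigLL and universeGeneIds. The requireNamespace() guard is there only so the snippet still runs where the packages are not installed.

```r
## Hypothetical stand-ins for your own sigLL and ontology choice
sigLL <- c("7157", "672", "675")
onto  <- "BP"

if (requireNamespace("GOstats", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
    library(GOstats)
    library(org.Hs.eg.db)

    ## Assumed universe: all Entrez Gene IDs with at least one GO term
    universeGeneIds <- mappedkeys(org.Hs.egGO)

    params <- new("GOHyperGParams",
                  geneIds = sigLL,
                  universeGeneIds = universeGeneIds,
                  annotation = "org.Hs.eg.db",  # the argument that was missing
                  ontology = onto,
                  pvalueCutoff = 0.01,
                  conditional = FALSE,
                  testDirection = "over")

    hgOver <- hyperGTest(params)
    print(head(summary(hgOver)))
}
```

Since your identifiers are Entrez Gene IDs rather than probe IDs, the organism-level package org.Hs.eg.db is the natural choice for the annotation argument, and the universe you pass in still restricts the test to the genes on your platform.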

Best,

Jim





On 5/9/2013 7:31 AM, gregory voisin wrote:
> Hi Jim,
>
> In my mind, in GO term analysis theory, two arguments are necessary:
> the selected gene list and the gene universe. The annotation should
> be necessary only if universeGeneIds is absent.
> After that, GOstats knows that the submitted Entrez IDs are human, but
> not that the experiment is based on an Agilent platform.
> Here, in this example: onto = "BP"
>
>
> See a previous mail to Dan with a complete example of data and code.
>
> Thanks for your help
> ------------------------------------------------------------------------
> *From:* James W. MacDonald <jmacdon at uw.edu>
> *To:* gregory voisin <voisingreg at yahoo.fr>
> *Cc:* bioconductor <bioconductor at stat.math.ethz.ch>
> *Sent:* Wednesday, May 8, 2013 19:19
> *Subject:* Re: [BioC] serious problem with GOstats package
>
> Hi Greg,
>
> On 5/8/2013 11:38 AM, gregory voisin wrote:
> > Hi,
> >
> > For my current analysis, I use the GOstats package because it's a good,
> > basic, simple package.
> > I don't know why, but for some time now,
> >
> > when I use this code:
> >
> > params<- new("GOHyperGParams", geneIds= sigLL, universeGeneIds = universeGeneIds,
> >              ontology=onto, pvalueCutoff=0.01, conditional=FALSE, testDirection="over")
>
> You are missing the annotation argument.
>
> Best,
>
> Jim
>
>
> >
> > hgOver<- hyperGTest(params)
> >
> > I have this error message:
> >
> >
> > Error in eapply(ID2GO(datPkg), function(goids) { :
> >   error in evaluating the argument 'env' in selecting a method for
> >   function 'eapply': Error in (function (classes, fdef, mtable) :
> >   unable to find an inherited method for function ‘cols’ for
> >   signature ‘"function"’
> >
> > I have tested with R2.15.1, 2.12.! always the same problem. I think 
> that the update of the package is sometimes problematic.
> >
> >
> > If you have a solution, a suggestion or an alternative ( I'm going 
> to see topGO)
> >
> >
> >
> > Thanks for your help
> >
> >
> >> sessionInfo()
> > R version 3.0.0 (2013-04-03)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >
> > locale:
> > [1] LC_COLLATE=French_Canada.1252  LC_CTYPE=French_Canada.1252    LC_MONETARY=French_Canada.1252
> > [4] LC_NUMERIC=C                   LC_TIME=French_Canada.1252
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> >  [1] hgu95av2.db_2.9.0    org.Hs.eg.db_2.9.0   ALL_1.4.14           topGO_2.12.0
> >  [5] SparseM_0.99         GO.db_2.9.0          GOstats_2.26.0       RSQLite_0.11.3
> >  [9] DBI_0.2-6            graph_1.38.0         Category_2.26.0      AnnotationDbi_1.22.5
> > [13] Biobase_2.20.0       BiocGenerics_0.6.0   limma_3.16.3         BiocInstaller_1.10.1
> >
> > loaded via a namespace (and not attached):
> >  [1] annotate_1.38.0       AnnotationForge_1.2.1 genefilter_1.42.0     grid_3.0.0
> >  [5] GSEABase_1.22.0       IRanges_1.18.0        lattice_0.20-15       RBGL_1.36.2
> >  [9] splines_3.0.0         stats4_3.0.0          survival_2.37-4       tools_3.0.0
> > [13] XML_3.96-1.1          xtable_1.7-1
> >     [[alternative HTML version deleted]]
> >
> >
> >
>
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>
>



