[BioC] GOstats problem with output

Robert M. Flight rflight79 at gmail.com
Thu Apr 7 18:49:54 CEST 2011


Hi Assa,

As far as I am aware, if the GO term comes up in your list, then there
should be genes annotated to it. I did a simple test to verify that
the GO term does exist:

> library(GO.db)
> crud <- as.list(GOTERM)
> crud$'GO:2000021'
GOID: GO:2000021
Term: regulation of ion homeostasis
Ontology: BP
Definition: Any process that modulates the frequency, rate or extent
of ion homeostasis.
Synonym: regulation of electrolyte homeostasis
Synonym: regulation of negative regulation of crystal biosynthesis
Synonym: regulation of negative regulation of crystal formation

So far so good. Now let's see which genes are annotated to it:

> library(org.Mm.eg.db)
> mget('GO:2000021',org.Mm.egGO)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
  value for "GO:2000021" not found

> mget('GO:2000021',org.Mm.egGO2EG)
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
  value for "GO:2000021" not found
> mget('GO:2000021',org.Mm.egGO2ALLEGS)
$`GO:2000021`
     ISO      ISO      ISO      ISO      IGI      IGI      IMP
 "11517"  "11684"  "11998"  "12000"  "12018"  "12028"  "12028"
     IGI      ISO      ISO      IMP      ISO      ISO      IDA
 "12043"  "12061"  "12257"  "12291"  "12349"  "12372"  "12389"
     ISO      ISO      ISO      ISO      ISO      IMP      ISO
 "12424"  "12558"  "13167"  "13489"  "13617"  "13666"  "14062"
     ISO      IDA      IMP      IMP      IGI      IGI      ISO
 "14126"  "14225"  "14225"  "14226"  "14629"  "14630"  "14652"
     ISO      IDA      IDA      ISO      IDA      ISO       IC
 "15171"  "15978"  "16818"  "16867"  "16963"  "17096"  "17131"
     ISO      IMP      IMP      IDA      IMP      ISO      ISO
 "18429"  "18439"  "18764"  "19264"  "20190"  "21333"  "21336"
     ISO      ISO      IMP      ISO      ISO      TAS      IDA
 "21803"  "21808"  "21819"  "21838"  "22041"  "22784"  "23832"
     ISO      ISO      ISO      ISO      ISO      ISO      ISO
 "24111"  "26361"  "50849"  "54140"  "76055"  "76757" "108837"
     ISO      IMP      ISO      ISO      IMP      ISO
"217369" "225908" "233081" "238276" "259277" "317757"

BTW, this was all using GO.db_2.4.5

From this output, no genes are annotated directly to your GO term; the
only annotations are indirect ones, inherited from its child terms
through the GO2ALLEGS mapping. I know this doesn't solve your immediate
problem, but it at least points towards the cause. I had thought,
however, that the summary was prepared using the GO2ALLEGS mapping and
not the direct one. Perhaps someone more knowledgeable can figure out
where in the code the error is likely to be?
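
The direct-versus-inherited comparison described above can be sketched
in a few lines. This is only an illustration: genes_for_go() is a
made-up helper name, not part of GOstats or AnnotationDbi, and it relies
on mget() accepting ifnotfound = NA for these maps so that a missing key
yields NA instead of the "not found" error.

```r
# Sketch only: genes_for_go() is a hypothetical helper, not a GOstats or
# AnnotationDbi function. It returns the unique gene IDs stored under a
# GO ID in a map, or character(0) when the map has no entry for that ID.
genes_for_go <- function(go_id, map) {
  # ifnotfound = NA turns the "value for ... not found" error into an NA
  hit <- mget(go_id, map, ifnotfound = NA)[[1]]
  if (length(hit) == 1 && is.na(hit)) character(0) else unique(unname(hit))
}

# Only runs if the mouse annotation package is installed.
if (requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  library(org.Mm.eg.db)
  go_id <- "GO:2000021"
  direct    <- genes_for_go(go_id, org.Mm.egGO2EG)      # the term itself
  inherited <- genes_for_go(go_id, org.Mm.egGO2ALLEGS)  # term plus child terms
  cat(go_id, ":", length(direct), "direct,", length(inherited),
      "including child terms\n")
}
```

The counts will depend on the installed org.Mm.eg.db version, so none
are quoted here. A term with zero direct genes but a non-empty inherited
set matches the situation shown in the mget() calls above.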

-Robert

Robert M. Flight, Ph.D.
University of Louisville Bioinformatics Laboratory
University of Louisville
Louisville, KY

PH 502-852-1809 (HSC)
PH 502-852-0467 (Belknap)
EM robert.flight at louisville.edu
EM rflight79 at gmail.com

Williams and Holland's Law:
       If enough data is collected, anything may be proven by
statistical methods.



On Thu, Apr 7, 2011 at 11:22, Assa Yeroslaviz <frymor at gmail.com> wrote:
> Hi,
>
> I am trying to run hyperGTest from GOstats on mouse Entrez gene IDs.
>
> I imported the IDs from biomaRt:
> entrez_data_1 <- getBM(attributes = c("mgi_id", "entrezgene"),
>     filters = "mgi_id", values = as.character(data_1$MGI), mart = mart)
> head(entrez_data_1)
> entrezID_Universe <- getBM(mart = mart,
>     attributes = c("mgi_id", "entrezgene"),
>     filters = "mgi_id", values = as.character(MaxQuant18$MGI))
> entrezID_Universe
> paramsBP <- new("GOHyperGParams",
>     geneIds = as.character(entrez_data_1[, 2]),
>     universeGeneIds = as.character(entrezID_Universe[, 2]),
>     annotation = "org.Mm.eg.db", ontology = "BP",
>     pvalueCutoff = 0.05, conditional = FALSE, testDirection = "over")
> I then ran the hyperGTest command, which succeeded:
> MmOverBP <- hyperGTest(paramsBP)
> MmOverBP
> Gene to GO BP  test for over-representation
> 3146 GO BP ids tested (118 have p < 0.05)
> Selected gene set size: 1006
>    Gene universe size: 2935
>    Annotation package: org.Mm.eg
> but then:
>> summary(MmOverBP)
> Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
>  value for "GO:2000021" not found
>
> As far as I know, I have the latest version of both packages. I checked in
> AmiGO whether this GO ID exists, and it does:
>
>   Accession: GO:2000021
>   Ontology: Biological Process
>   Synonyms:
>     related: regulation of electrolyte homeostasis
>     related: regulation of negative regulation of crystal biosynthesis
>     related: regulation of negative regulation of crystal formation
>
> Is there a way of adding or annotating this specific item manually, so
> that I can see it?
> If not, is there a way of excluding this GO ID from the list of GO
> categories, so that I can see the results?
>
> Thanks a lot
> Assa
>
>
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines   grid      stats     graphics  grDevices utils     datasets
> [8] methods   base
>
> other attached packages:
>  [1] GO.db_2.4.1          org.Mm.eg.db_2.4.6   biomaRt_2.6.0
>  [4] Heatplus_1.20.0      gplots_2.8.0         caTools_1.11
>  [7] bitops_1.0-4.1       gdata_2.8.1          gtools_2.6.2
> [10] siggenes_1.24.0      multtest_2.7.1       Rgraphviz_1.29.0
> [13] xtable_1.5-6         annotate_1.28.1      GOstats_2.16.0
> [16] RSQLite_0.9-4        DBI_0.2-5            graph_1.28.0
> [19] Category_2.16.0      AnnotationDbi_1.12.0 Biobase_2.10.0
>
> loaded via a namespace (and not attached):
> [1] genefilter_1.32.0 GSEABase_1.12.1   MASS_7.3-11       RBGL_1.26.0
> [5] RCurl_1.5-0       survival_2.36-5   tcltk_2.12.2      tools_2.12.2
> [9] XML_3.2-0
>