[BioC] mismatch in GO terms between topGO_1.14.0 and org.Mm.eg.db_2.3.6

Dick Beyer dbeyer at u.washington.edu
Wed Mar 3 01:15:33 CET 2010


Hello,

I've been running topGO (using mouse Entrez Gene IDs) and found that some GO terms that turn up in the topGO analysis are not in the GO terms from org.Mm.eg.db.

I'd like to give example code that reproduces the problem, but my topGO script is long.  The output looks like:

allResults[[1]][[1]][1:2,]
         GO.ID                                Term Annotated Significant Expected classic    elim weight
714 GO:0019222     regulation of metabolic process      2498         143   107.08 0.00010 0.17956 0.9057
762 GO:0006807 nitrogen compound metabolic process      3413         186   146.31 0.00011 0.45337 0.9434

So, the topGO output includes a GO.ID column, among others.

Some of the problem GO IDs from topGO are GO:0030522, GO:0051094, GO:0031497, GO:0046700.

I can't find these in names(Mm.egGO2EG):

library("org.Mm.eg.db")
Mm.egGO2EG <- as.list(org.Mm.egGO2EG)
grep("GO:0030522",names(Mm.egGO2EG))
integer(0)
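
The same check can be run for several IDs at once with an exact match instead of grep(), which matches substrings. This is only a minimal sketch: missing_go_ids() is a hypothetical helper, and the toy list with placeholder IDs stands in for as.list(org.Mm.egGO2EG).

```r
# Report which query IDs are absent from the names of a GO-to-gene list.
# `go2genes` is any named list keyed by GO ID, e.g. as.list(org.Mm.egGO2EG).
missing_go_ids <- function(ids, go2genes) {
  ids[!(ids %in% names(go2genes))]
}

# Toy stand-in for the real mapping (placeholder IDs, not real annotations).
toy_go2eg <- list("GO:TOY0001" = c("gene_a", "gene_b"),
                  "GO:TOY0002" = "gene_c")

absent <- missing_go_ids(c("GO:TOY0001", "GO:TOY0003"), toy_go2eg)
print(absent)
```

With the real data, the call would be missing_go_ids(c("GO:0030522", "GO:0051094", "GO:0031497", "GO:0046700"), Mm.egGO2EG).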

Is it possible that topGO depends on GO.db, while I'm using org.Mm.eg.db?  When I look up GO:0030522 for Mus musculus at geneontology.org, it is a valid term.
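
One possible explanation, not confirmed for these particular IDs: org.Mm.egGO2EG lists only genes annotated directly to a term, whereas topGO counts a gene for a term when it is annotated to that term or to any of its descendants. A term with no direct annotations can therefore appear in topGO output and still be absent from names(Mm.egGO2EG). The sketch below illustrates the idea with a toy hierarchy; all term and gene IDs are made up, and propagate() is a hypothetical helper, not topGO's implementation.

```r
# Toy GO fragment: each term lists its parent terms (hypothetical IDs).
toy_parents <- list("GO:CHILD1" = "GO:PARENT",
                    "GO:CHILD2" = "GO:PARENT",
                    "GO:PARENT" = character(0))

# Direct annotations only: no gene is annotated to GO:PARENT itself.
toy_direct <- list("GO:CHILD1" = c("gene_a", "gene_b"),
                   "GO:CHILD2" = "gene_c")

# Collect a term and all of its ancestors by walking the parent links.
ancestors_and_self <- function(term, parents) {
  seen <- character(0)
  todo <- term
  while (length(todo) > 0) {
    current <- todo[1]
    todo <- todo[-1]
    if (!(current %in% seen)) {
      seen <- c(seen, current)
      todo <- c(todo, parents[[current]])
    }
  }
  seen
}

# Push every direct annotation up to the term's ancestors.
propagate <- function(direct, parents) {
  all_map <- list()
  for (term in names(direct)) {
    for (target in ancestors_and_self(term, parents)) {
      all_map[[target]] <- union(all_map[[target]], direct[[term]])
    }
  }
  all_map
}

toy_all <- propagate(toy_direct, toy_parents)
print("GO:PARENT" %in% names(toy_direct))  # parent missing from direct map
print(toy_all[["GO:PARENT"]])              # but it inherits genes from children
```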

I'm puzzled by the mismatch.  I want to get the genes for a given GO ID, so there is probably a workaround.  If anyone has a suggestion or idea, I'd be very grateful to know what to try.
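
One workaround to try, assuming the missing terms carry only inherited annotations, is to look the IDs up in org.Mm.egGO2ALLEGS, which also includes genes annotated to child terms. In the sketch below genes_for_go() is a hypothetical helper; the lookup against the real package runs only if org.Mm.eg.db is installed. Alternatively, topGO's genesInTerm() can return the genes per term straight from the topGOdata object used in the analysis.

```r
# Return the unique gene IDs for one GO ID from a named list keyed by GO ID.
# Entries may repeat a gene once per evidence code, hence unique().
genes_for_go <- function(go_id, go2genes) {
  genes <- go2genes[[go_id]]
  if (is.null(genes)) character(0) else unique(unname(genes))
}

# Real lookup, guarded so the sketch still runs without the annotation package.
if (requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  library("org.Mm.eg.db")
  go2all <- as.list(org.Mm.egGO2ALLEGS)  # includes child-term annotations
  print(head(genes_for_go("GO:0030522", go2all)))
}
```

An ID that is absent from the mapping yields character(0) rather than an error, so the helper can be applied over a whole vector of GO IDs with lapply().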

Thanks very much,
Dick

Here is my session info:

sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-redhat-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=C
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] limma_3.2.1         topGO_1.14.0        SparseM_0.83        graph_1.24.1        GO.db_2.3.5         org.Mm.eg.db_2.3.6  RSQLite_0.7-3
 [8] DBI_0.2-4           AnnotationDbi_1.8.1 Biobase_2.6.0       biomaRt_2.2.0       gplots_2.7.4        caTools_1.10        bitops_1.0-4.1
[15] gdata_2.6.1         gtools_2.6.1

loaded via a namespace (and not attached):
[1] lattice_0.17-26 RCurl_1.3-0     tools_2.10.0    XML_2.6-0

*******************************************************************************
Richard P. Beyer, Ph.D.	University of Washington
Tel.:(206) 616 7378	Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696	4225 Roosevelt Way NE, # 100
			Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
http://staff.washington.edu/~dbeyer


