[BioC] problem with GO terms

Marc Carlson mcarlson at fhcrc.org
Tue Nov 29 18:50:01 CET 2011


Hi Ina,

The trouble you are having is not that this GO term is old.  Neither of 
the two GO terms you mentioned has been deprecated in Bioconductor as 
of the most recent release.  How do I know this?

library(GO.db)
## this just gets all the terms from the current ontology
bar = keys(GOTERM)
head(bar)
## the following shows that both terms are still in the ontology
"GO:0035637" %in% bar
"GO:0050864" %in% bar

What I think is actually causing you grief is that you are trying to 
look up the term using the wrong mappings.  You have this term and you 
want to see which genes it maps to, so (in the 1st example) you are 
looking at the "org.Hs.egGO2EG" mapping when you really want to be 
looking in the "org.Hs.egGO2ALLEGS" mapping.

So what is the difference?  Well, the 1st mapping holds only direct GO 
to gene mappings, i.e. the annotations we have direct evidence for in 
the database.  So why wouldn't you want to use that in this instance?  
Because GO is a gene "ontology", certain terms can be inferred to have 
a relationship to genes based purely on the fact that their child terms 
have been directly linked.  Such "indirect" parent terms will NOT show 
up in the GO2EG style mappings, but they will show up in the GO2ALLEGS 
style of mapping.
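
To make the distinction concrete, here is a minimal toy sketch in base R. 
The term names, the gene name and the ancestors() helper are all made up 
for illustration; this is not how the annotation packages are actually 
built, just the idea of propagating a direct annotation up to its parent 
terms:

```r
## Toy ontology: each term lists its direct parent terms (names are made up).
parents <- list(
  "GO:child"  = "GO:parent",
  "GO:parent" = "GO:root",
  "GO:root"   = character(0)
)

## Direct annotations only: the gene is attached to the child term.
direct <- list("GO:child" = "geneA")

## Collect a term plus all of its ancestors by walking up the parent links.
ancestors <- function(term) {
  up <- parents[[term]]
  unique(c(term, unlist(lapply(up, ancestors))))
}

## Propagate each direct annotation to every ancestor ("ALL"-style mapping).
all_map <- list()
for (term in names(direct)) {
  for (anc in ancestors(term)) {
    all_map[[anc]] <- union(all_map[[anc]], direct[[term]])
  }
}

"GO:parent" %in% names(direct)   # FALSE: no direct evidence for the parent
"GO:parent" %in% names(all_map)  # TRUE: inherited from the child term
```

Here "direct" plays the role of a GO2EG style mapping and "all_map" the 
role of a GO2ALLEGS style mapping.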

So this will NOT work:
get("GO:0050864", org.Hs.egGO2EG)

But this will work:
get("GO:0050864", org.Hs.egGO2ALLEGS)

I believe the same issue is happening with your canine example.  So in 
that case you really want to be using this:

get("GO:0035637", org.Cf.egGO2ALLEGS)
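
If you would rather get a quiet "not found" than an error when a key is 
missing, one option is mget() with ifnotfound=NA.  Below is a rough 
sketch: lookup_go() is a hypothetical helper, and toy_map is a plain 
environment standing in for an annotation mapping so that the code runs 
without any Bioconductor packages.  The keys and gene IDs are made up.

```r
## Hypothetical helper: return the genes for a GO term, or NULL (with a
## message) instead of an error when the key is absent from the mapping.
lookup_go <- function(term, map) {
  hit <- mget(term, envir = map, ifnotfound = NA)[[1]]
  if (length(hit) == 1 && is.na(hit)) {
    message(term, " has no entry in this mapping")
    return(NULL)
  }
  hit
}

## Stand-in for an annotation mapping, so the sketch runs on its own.
toy_map <- new.env()
assign("GO:toy1", c("101", "102"), envir = toy_map)

lookup_go("GO:toy1", toy_map)  # the two toy gene IDs
lookup_go("GO:toy2", toy_map)  # NULL, plus a message
```

With org.Hs.eg.db attached, I would expect 
lookup_go("GO:0050864", org.Hs.egGO2ALLEGS) to behave the same way, since 
these mappings support mget(), but treat that as an assumption and check 
it against your installed version.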


Hope this helps, please let us know if there are any other issues.



   Marc








On 11/28/2011 02:32 PM, James W. MacDonald wrote:
> Hi Ina,
>
> It would be helpful if you would give us a _minimal_  and functional 
> example of what you did. We will also need the set of entrez IDs you 
> used in order to see if we can duplicate. You can output your entrez 
> IDs using the dump() function (e.g., dump("entrezIDs", "")).
>
> Best,
>
> Jim
>
>
>
> On 11/28/2011 5:08 PM, Ina Hoeschele wrote:
>> Hi all,
>>    I am sorry but I still have not been able to solve my problem. I 
>> did a GO analysis using GOstats on another dataset, this time a 
>> canine dataset. The top BP category that I get from GOstats again 
>> does not exist any more! Please see below. I reinstalled everything, 
>> including GOstats, and have the current versions. How is it possible 
>> for GOstats to give me these old categories ...
>>
>>> get("GO:0035637",canine2GO2PROBE)
>> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>>    value for "GO:0035637" not found
>>
>>> sessionInfo()
>> R version 2.14.0 (2011-10-31)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] grid      stats     graphics  grDevices utils     datasets  methods
>> [8] base
>>
>> other attached packages:
>>   [1] org.Hs.eg.db_2.6.4   GOstats_2.20.0       graph_1.32.0
>>   [4] Category_2.20.0      KEGG.db_2.6.1        GO.db_2.6.1
>>   [7] biomaRt_2.10.0       canine2cdf_2.9.1     canine2.db_2.6.3
>> [10] org.Cf.eg.db_2.6.4   RSQLite_0.10.0       DBI_0.2-5
>> [13] annotate_1.32.0      AnnotationDbi_1.16.5 limma_3.10.0
>> [16] made4_1.28.0         scatterplot3d_0.3-33 gplots_2.10.1
>> [19] KernSmooth_2.23-7    caTools_1.12         bitops_1.0-4.1
>> [22] gdata_2.8.2          gtools_2.6.2         RColorBrewer_1.0-5
>> [25] ade4_1.4-17          affy_1.32.0          Biobase_2.14.0
>> [28] BiocInstaller_1.2.1
>>
>> loaded via a namespace (and not attached):
>>   [1] affyio_1.22.0         genefilter_1.36.0     GSEABase_1.16.0
>>   [4] IRanges_1.12.3        preprocessCore_1.16.0 RBGL_1.30.1
>>   [7] RCurl_1.7-0.1         splines_2.14.0        survival_2.36-10
>> [10] tools_2.14.0          XML_3.4-3             xtable_1.6-0
>> [13] zlibbioc_1.0.0
>>
>> ----- Original Message -----
>> From: "James W. MacDonald"<jmacdon at med.umich.edu>
>> To: "Ina Hoeschele"<inah at vbi.vt.edu>
>> Cc: "Bioconductor mailing list"<bioconductor at r-project.org>
>> Sent: Tuesday, November 22, 2011 1:52:56 PM
>> Subject: Re: [BioC] problem with GO terms
>>
>> Hi Ina,
>>
>>
>>
>> On 11/22/2011 1:03 PM, Ina Hoeschele wrote:
>>> thank you, Jim ...
>>> I did what you show below and I get the same result:
>>>
>>> >   get("GO:0050864", org.Hs.egGO2EG)
>>> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>>>      value for "GO:0050864" not found
>>>
>>> but why is GOstats giving me this GO term?
>> Did you use GOstats with this current version of BioC, or are you using
>> data you processed sometime in the past?
>>
>> As far as I can tell, it is impossible for you to be getting that GO
>> term if you are using the current version of these packages. I am
>> assuming that your data are from the Illumina Human V4 chip.
>>
>> >  get("GO:0050864", illuminaHumanv4GO2PROBE)
>> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>>     value for "GO:0050864" not found
>>
>> This is with the version of the illuminaHumanV4.db package that you are
>> using. Since this isn't even in that package, it is not possible for
>> GOstats to be reporting it as being significant.
>>
>> Best,
>>
>> Jim
>>> Thanks again, Ina
>>>
>>>> sessionInfo()
>>> R version 2.14.0 (2011-10-31)
>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252
>>> [2] LC_CTYPE=English_United States.1252
>>> [3] LC_MONETARY=English_United States.1252
>>> [4] LC_NUMERIC=C
>>> [5] LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>>    [1] biomaRt_2.10.0            GOstats_2.20.0
>>>    [3] graph_1.32.0              Category_2.20.0
>>>    [5] PFAM.db_2.6.1             KEGG.db_2.6.1
>>>    [7] GO.db_2.6.1               annotate_1.32.0
>>>    [9] illuminaHumanv4.db_1.12.1 org.Hs.eg.db_2.6.4
>>> [11] RSQLite_0.10.0            DBI_0.2-5
>>> [13] AnnotationDbi_1.16.4      Biobase_2.14.0
>>> [15] BiocInstaller_1.2.1
>>>
>>> loaded via a namespace (and not attached):
>>>    [1] genefilter_1.36.0 GSEABase_1.16.0   IRanges_1.12.2    
>>> RBGL_1.30.1
>>>    [5] RCurl_1.7-0.1     splines_2.14.0    survival_2.36-10  
>>> tools_2.14.0
>>>    [9] XML_3.4-2.2       xtable_1.6-0
>


