[BioC] how to get gene list for given GO terms?

Marc Carlson mcarlson at fhcrc.org
Thu Mar 1 01:30:23 CET 2012


Hi Jianhong,

The first example Martin showed returns only the genes annotated directly 
to the GO term.  If your GO term may be an ancestor of other terms and you 
want the genes annotated to its descendant terms as well, please see the 
GO2ALLEGS mapping:

library(org.Hs.eg.db)
help("org.Hs.egGO2ALLEGS")

## then to get your Entrez Gene IDs you can just use this:
mget(c("GO:0042254"),org.Hs.egGO2ALLEGS)
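
To make the difference between the two kinds of lookup concrete, here is a 
minimal sketch in plain R. It uses toy data only: the term names, the gene 
IDs, and the helper genes_with_offspring() are made up for illustration and 
are not part of any annotation package. The idea is that the "ALL" style of 
mapping behaves like a union of the genes for a term and for every one of 
its offspring terms.

```r
# Toy data only: term names and gene IDs are invented for illustration.
# Direct annotations: term -> genes annotated to exactly that term.
direct <- list(
  parent = c("g1"),
  child1 = c("g2", "g3"),
  child2 = c("g3", "g4")
)

# Offspring: term -> all descendant terms of that term.
offspring <- list(
  parent = c("child1", "child2"),
  child1 = character(0),
  child2 = character(0)
)

# Hypothetical helper: genes for a term plus all of its offspring,
# with duplicates removed.
genes_with_offspring <- function(term, direct, offspring) {
  terms <- c(term, offspring[[term]])
  sort(unique(unlist(direct[terms], use.names = FALSE)))
}

# Immediate associations only.
immediate <- sort(unique(direct[["parent"]]))

# Associations for the term and everything beneath it.
all_genes <- genes_with_offspring("parent", direct, offspring)

print(immediate)
print(all_genes)
```

With the real packages, the org.Hs.egGO2ALLEGS map already does this roll-up 
for you, so a helper like this should not be needed in practice.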

I hope this helps,


   Marc



On 02/29/2012 02:33 PM, Martin Morgan wrote:
> On 02/29/2012 11:13 AM, Ou, Jianhong wrote:
>> Hi Martin,
>>
>> Thank you for your reply.
>>
>> The question may be divided into two parts. The first part is what 
>> you answered. The second is that the given GO term may be the 
>> ancestor of other GO terms which are not annotated in the 
>> org.Hs.egGO db.
>>
>> What I did for this was to first map all the gene Entrez IDs to GO 
>> terms and get all the ancestors of those GO terms, then go back and 
>> extract all the genes involved in one GO term.
>
> I don't really understand your question so probably shouldn't try to 
> answer, but if I knew a GO term GO:0006281 I could find out about it 
> and all its offspring (or immediate children, if I used GOBPCHILDREN)
>
> > library(GO.db)
> > library(org.Hs.eg.db)  # provides org.Hs.egGO, used below
> > GOTERM[["GO:0006281"]]
> GOID: GO:0006281
> Term: DNA repair
> Ontology: BP
> Definition: The process of restoring DNA after damage. Genomes are
>     subject to damage by chemical and physical agents in the
>     environment (e.g. UV and ionizing radiations, chemical mutagens,
>     fungal and bacterial toxins, etc.) and by free radicals or
>     alkylating agents endogenously generated in metabolism. DNA is also
>     damaged because of errors during its replication. A variety of
>     different DNA repair pathways have been reported that include
>     direct reversal, base excision repair, nucleotide excision repair,
>     photoreactivation, bypass, double-strand break repair pathway, and
>     mismatch repair pathway.
> > off = GOBPOFFSPRING[["GO:0006281"]]
> > length(off)
> [1] 91
> > genes = mget(off, revmap(org.Hs.egGO), ifnotfound=NA)
> > head(genes[!is.na(genes)], 3)
> $`GO:0000012`
>         IDA         IDA         IEA         IDA         IDA         IMP
>      "3981"      "7141"      "7515"     "54840"     "55775"     "55775"
>         IMP         IEA
>    "200558" "100133315"
>
> $`GO:0000710`
>    IBA    IBA    ISS    IBA    IBA
> "2072" "2956" "2956" "4436" "4437"
>
> $`GO:0000715`
>    IDA    TAS    IDA    TAS
> "5887" "5887" "7508" "7508"
>
> Packages here
>
>   http://bioconductor.org/packages/release/BiocViews.html#___GO
>
> might be helpful, for instance goTools.
>
> Martin
>
>> I would appreciate it if there is any package that can do this.
>>
>> Yours sincerely,
>>
>> Jianhong Ou
>>
>> jianhong.ou at umassmed.edu
>>
>>
>> On Feb 29, 2012, at 1:47 PM, Martin Morgan wrote:
>>
>>> On 02/27/2012 04:41 PM, Ou, Jianhong wrote:
>>>> Hello All,
>>>>
>>>> Is there any package that can extract a gene list for a given GO 
>>>> term from the human genome?
>>>
>>>
>>> For instance
>>>
>>> library(org.Hs.eg.db)
>>> revmap(org.Hs.egGO)[["GO:0042254"]]
>>>      IEA      IEA      IEA      IEA      IEA      IBA      IEA      IEA
>>>    "705"   "3692"   "4809"   "6130"   "6175"   "6222"   "6838"  "10171"
>>>      IEA      IEA       IC      NAS      IEA      IEA      IEA      IEA
>>> "10969"  "23212"  "23246"  "23246"  "23560"  "26164"  "26284"  "26574"
>>>      IEA      IEA      IEA      IEA      IEA      IEA      IEA      IEA
>>> "29102"  "29889"  "51154"  "51187"  "54552"  "54680"  "55299"  "79631"
>>>      IEA      IEA      IEA      IDA      IEA
>>> "81875"  "84864"  "85865"  "92345" "124995"
>>>
>>> see
>>>
>>>   vignette(package="AnnotationDbi", "AnnotationDbi")
>>>
>>> Martin
>>>
>>>>
>>>> Thanks a lot.
>>>>
>>>> Yours sincerely,
>>>>
>>>> Jianhong Ou
>>>>
>>>> jianhong.ou at umassmed.edu
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: 
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>> -- 
>>> Computational Biology
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>>
>>> Location: M1-B861
>>> Telephone: 206 667-2793
>>
>>
>
>


