[BioC] how to get gene list for given GO terms?

Martin Morgan mtmorgan at fhcrc.org
Wed Feb 29 23:33:38 CET 2012


On 02/29/2012 11:13 AM, Ou, Jianhong wrote:
> Hi Martin,
>
> Thank you for your reply.
>
> The question can be divided into two parts. The first part is what you answered. The second is that the given GO term may be an ancestor of other GO terms that are not annotated in the org.Hs.egGO db.
>
> What I did for this was to first map all the gene Entrez IDs to GO terms and get all the ancestors of those GO terms, then go back and extract all the genes involved in one GO term.

I don't really understand your question, so I probably shouldn't try to 
answer. But if I knew a GO term, say GO:0006281, I could find out about it 
and all its offspring (or just its immediate children, if I used 
GOBPCHILDREN), and then look up the genes annotated to each of those terms:

 > library(GO.db)
 > library(org.Hs.eg.db)  # provides the org.Hs.egGO map used below
 > GOTERM[["GO:0006281"]]
GOID: GO:0006281
Term: DNA repair
Ontology: BP
Definition: The process of restoring DNA after damage. Genomes are
     subject to damage by chemical and physical agents in the
     environment (e.g. UV and ionizing radiations, chemical mutagens,
     fungal and bacterial toxins, etc.) and by free radicals or
     alkylating agents endogenously generated in metabolism. DNA is also
     damaged because of errors during its replication. A variety of
     different DNA repair pathways have been reported that include
     direct reversal, base excision repair, nucleotide excision repair,
     photoreactivation, bypass, double-strand break repair pathway, and
     mismatch repair pathway.
 > off = GOBPOFFSPRING[["GO:0006281"]]
 > length(off)
[1] 91
 > genes = mget(off, revmap(org.Hs.egGO), ifnotfound=NA)
 > head(genes[!is.na(genes)], 3)
$`GO:0000012`
         IDA         IDA         IEA         IDA         IDA         IMP
      "3981"      "7141"      "7515"     "54840"     "55775"     "55775"
         IMP         IEA
    "200558" "100133315"

$`GO:0000710`
    IBA    IBA    ISS    IBA    IBA
"2072" "2956" "2956" "4436" "4437"

$`GO:0000715`
    IDA    TAS    IDA    TAS
"5887" "5887" "7508" "7508"
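
To collapse this into a single gene list for the term together with all of 
its offspring, something like the following might work. This is only a 
sketch, not part of the session above: it assumes GO.db and org.Hs.eg.db 
are installed, and the variable names (term, entrez, by_hand) are made up 
for illustration. It relies on the org.Hs.egGO2ALLEGS map in org.Hs.eg.db, 
which maps a GO term to the genes annotated to that term or to any of its 
offspring.

```r
library(GO.db)
library(org.Hs.eg.db)

term <- "GO:0006281"  # DNA repair

# Entrez IDs annotated to the term itself or to any of its offspring.
# Names are evidence codes, so a gene can appear more than once.
all_genes <- org.Hs.egGO2ALLEGS[[term]]
entrez <- unique(all_genes)

# The same idea built by hand: the term plus its offspring,
# then the genes directly annotated to each of those terms.
terms <- c(term, GOBPOFFSPRING[[term]])
per_term <- mget(terms, revmap(org.Hs.egGO), ifnotfound = NA)
by_hand <- unique(unlist(per_term[!is.na(per_term)], use.names = FALSE))

# Compare the two gene sets
setequal(entrez, by_hand)
```

I would expect the two sets to agree, or nearly so, but the annotation 
packages are rebuilt with each release, so it is worth checking rather 
than assuming.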

Packages here

   http://bioconductor.org/packages/release/BiocViews.html#___GO

might be helpful, for instance goTools.

Martin

> I would appreciate it if there is any package that can do this.
>
> Yours sincerely,
>
> Jianhong Ou
>
> jianhong.ou at umassmed.edu
>
>
> On Feb 29, 2012, at 1:47 PM, Martin Morgan wrote:
>
>> On 02/27/2012 04:41 PM, Ou, Jianhong wrote:
>>> Hello All,
>>>
>>> Is there any package that can extract a gene list for a given GO term from the human genome?
>>
>>
>> For instance
>>
>> library(org.Hs.eg.db)  # org.Hs.egGO is a map inside this package
>> revmap(org.Hs.egGO)[["GO:0042254"]]
>>      IEA      IEA      IEA      IEA      IEA      IBA      IEA      IEA
>>    "705"   "3692"   "4809"   "6130"   "6175"   "6222"   "6838"  "10171"
>>      IEA      IEA       IC      NAS      IEA      IEA      IEA      IEA
>> "10969"  "23212"  "23246"  "23246"  "23560"  "26164"  "26284"  "26574"
>>      IEA      IEA      IEA      IEA      IEA      IEA      IEA      IEA
>> "29102"  "29889"  "51154"  "51187"  "54552"  "54680"  "55299"  "79631"
>>      IEA      IEA      IEA      IDA      IEA
>> "81875"  "84864"  "85865"  "92345" "124995"
>>
>> see
>>
>>   vignette(package="AnnotationDbi", "AnnotationDbi")
>>
>> Martin
>>
>>>
>>> Thanks a lot.
>>>
>>> Yours sincerely,
>>>
>>> Jianhong Ou
>>>
>>> jianhong.ou at umassmed.edu
>>>
>>
>>
>> --
>> Computational Biology
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>
>> Location: M1-B861
>> Telephone: 206 667-2793
>
>




