[BioC] How to filter a list of genes by their ontology

Martin Morgan mtmorgan at fhcrc.org
Mon Sep 17 20:53:23 CEST 2012


On 09/17/2012 11:51 AM, James W. MacDonald wrote:
> Hi Laurent,
>
> On 9/17/2012 11:15 AM, Laurent Pays [guest] wrote:
>> Hi,
>> After analysing my arrays for differentially expressed genes, I get a
>> list of gene IDs. To reduce the number of genes in this list further,
>> I would like to retain only genes related to the immune system.
>> I've looked for packages dealing with "ontology" but I couldn't find
>> any that do this simple task...
>
> It's not really a simple task. If you want to assume that GO terms
> fulfill your criteria, then you can look for terms that contain the word
> 'immune'. You don't say what species you are working with, so I'll
> assume Homo sapiens.
>
>  > library(org.Hs.eg.db)
> ## fake up some gene IDs
>  > egids <- Lkeys(org.Hs.egSYMBOL)[sample(1:2e4, 500)]
>  > gos <- mget(egids, org.Hs.egGO)
>  > goterms <- sapply(gos, function(x) if(!is.null(names(x))) Term(names(x)))
>  > ind <- sapply(goterms, function(x) length(grep("immune", x))) > 0
>  > sum(ind)
> [1] 17
>  > egids[ind]
>  [1] "159"   "3934"  "3557"  "4057"  "3806"  "8742"  "8876"  "10581" "57115"
> [10] "55593" "6352"  "959"   "9865"  "3452"  "841"   "3608"  "6935"
>
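
The filtering step in Jim's code can be tried without any annotation
package. The sketch below is only an illustration under stated
assumptions: the gene IDs ("g1" to "g4") and their GO term descriptions
are made up, and has_term() is a hypothetical helper, not a function
from any Bioconductor package. The toy list stands in for what
mget() followed by Term() would give you, with NULL for a gene that has
no GO annotation.

```r
# Toy stand-in for the 'goterms' list built above: one entry per gene ID,
# holding that gene's GO term descriptions (NULL = no GO annotation).
# IDs and terms are invented for illustration only.
goterms <- list(
    "g1" = c("innate immune response", "protein binding"),
    "g2" = c("DNA repair"),
    "g3" = NULL,
    "g4" = c("regulation of immune system process")
)

# Hypothetical helper: TRUE for genes with at least one term containing
# 'pattern'. vapply() always returns a logical vector, even when some
# entries are NULL, so the result can be used directly as an index.
has_term <- function(terms, pattern) {
    vapply(terms, function(x) any(grepl(pattern, x, fixed = TRUE)),
           logical(1))
}

ind <- has_term(goterms, "immune")
immune_ids <- names(goterms)[ind]
print(immune_ids)
```

Keep in mind that plain text matching keeps every term containing the
word, so a term describing negative regulation of an immune process
would also be retained; whether that is what you want depends on your
question.
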
> Best,
>
> Jim
>
>
>
>>
>> Any idea on what package/function I could use?
>>
>> Thanks in advance for your help.
>>
>> L.P
>>
>>   -- output of sessionInfo():
>>
>> R version 2.7.0 (2008-04-22)
>> powerpc-apple-darwin8.10.1

Also, this R (2.7.0, from April 2008) is VERY out of date, and the first
thing you'll want to do is update it. Martin


>>
>> locale:
>> fr_FR.UTF-8/fr_FR.UTF-8/C/C/fr_FR.UTF-8/fr_FR.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


