[BioC] How to filter a list of genes by their ontology

James W. MacDonald jmacdon at uw.edu
Mon Sep 17 20:51:54 CEST 2012


Hi Laurent,

On 9/17/2012 11:15 AM, Laurent Pays [guest] wrote:
> Hi,
> After analysing my arrays for differentially expressed genes, I get a list of gene IDs. To reduce the number of genes in this list further, I would like to retain only genes related to the immune system. I've looked for packages dealing with "ontology" but I couldn't find any that do this simple task...

It's not really a simple task. If you want to assume that GO terms 
fulfill your criteria, then you can look for terms that contain the word 
'immune'. You don't say what species you are working with, so I'll 
assume Homo sapiens.

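 > library(org.Hs.eg.db)
## fake up some gene IDs
 > egids <- Lkeys(org.Hs.egSYMBOL)[sample(1:2e4, 500)]
## get the GO annotations for each Entrez Gene ID
 > gos <- mget(egids, org.Hs.egGO)
## look up the term description for each GO ID (NULL if a gene has none)
 > goterms <- sapply(gos, function(x) if(!is.null(names(x))) Term(names(x)))
## flag genes with at least one term containing 'immune'
 > ind <- sapply(goterms, function(x) length(grep("immune", x))) > 0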
 > sum(ind)
[1] 17
 > egids[ind]
 [1] "159"   "3934"  "3557"  "4057"  "3806"  "8742"  "8876"  "10581" "57115"
[10] "55593" "6352"  "959"   "9865"  "3452"  "841"   "3608"  "6935"
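
If you want to see just the filtering step without loading any annotation packages, here is a minimal sketch in base R. The gene names (g1 to g4), the term strings, and the helper has_term() are all made up for illustration and are not part of any Bioconductor package. The match here is case-insensitive, which is a small change from the grep() call above.

```r
## Toy stand-in for a list of GO term descriptions per gene.
## The names and terms are invented; NULL marks a gene with no annotation.
toy_terms <- list(
    g1 = c("innate immune response", "protein binding"),
    g2 = c("DNA repair"),
    g3 = NULL,
    g4 = c("regulation of Immune system process")
)

## Hypothetical helper: TRUE for each gene with at least one term
## matching 'pattern' (case-insensitive), FALSE otherwise.
has_term <- function(terms, pattern) {
    vapply(terms,
           function(x) any(grepl(pattern, x, ignore.case = TRUE)),
           logical(1))
}

keep <- has_term(toy_terms, "immune")
names(toy_terms)[keep]
```

Using vapply() with any(grepl()) always gives one logical value per gene, including for genes with no annotation, so the result can be used directly as an index.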

Best,

Jim



>
> Any idea on what package/function I could use?
>
> Thanks in advance for your help.
>
> L.P
>
>   -- output of sessionInfo():
>
> R version 2.7.0 (2008-04-22)
> powerpc-apple-darwin8.10.1
>
> locale:
> fr_FR.UTF-8/fr_FR.UTF-8/C/C/fr_FR.UTF-8/fr_FR.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


