[BioC] Getting top level GO category for a list of genes

Palle Villesen (BiRC) palle at birc.au.dk
Fri Sep 7 11:02:17 CEST 2007


Hi,

I'm trying to take a list of genes and their attributes (e.g. exon
count) and get the average exon count for genes in each of the top-level
GO categories (one step below the MF, BP and CC root terms).
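
To make the goal concrete, here is a toy sketch in base R of the summary
I am after, assuming the gene-to-category mapping were already known. The
data frame, its column names and the values are all made up for
illustration; getting the real mapping is the part I am stuck on.

```r
# Hypothetical toy data: one row per (gene, top-level category) pair.
# A gene annotated to several categories appears once per category.
genes <- data.frame(
  gene       = c("g1", "g1", "g2", "g3", "g3"),
  exon_count = c(4, 4, 10, 6, 6),
  category   = c("binding", "catalytic activity", "binding",
                 "binding", "transporter activity"),
  stringsAsFactors = FALSE
)

# Mean exon count per category; each gene counts once in every
# category it belongs to.
avg_exons <- tapply(genes$exon_count, genes$category, mean)
print(avg_exons)
```

With a table like this the averaging itself is one call to tapply(), so
the real work is building the category column.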

After reading the mailing lists and vignettes and googling around, I'm
still baffled - and I thought this would be easy.

I'm using biomaRt to get the GO IDs, but I think I need to "walk up"
the GO graph to reach the GO ID just below the top of each ontology.


library("biomaRt")
library("GO")

mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# g2 is my vector of Ensembl gene IDs
go1 <- getBM(attributes = c("ensembl_gene_id", "go", "evidence_code"),
             filters = "ensembl_gene_id", values = g2, mart = mart)
goids <- go1[, 2]  # second column holds the GO IDs

Now I have:
>goids[1:10]
 [1] "GO:0000089" "GO:0000090" "GO:0000093" "GO:0003674" "GO:0005634"
 [6] "GO:0005813" "GO:0005819" "GO:0007049" "GO:0007093" "GO:0051301"
>

I guess I should filter them to keep only the MF IDs, then walk up the
graph to the second-level MF ID, using GOMFANCESTOR somehow (?)
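
Roughly what I have in mind for the "walk up" step, sketched in base R on
a made-up miniature ontology. The term names and the parents list are
hypothetical; I assume GOMFANCESTOR could supply the ancestors of a real
MF term directly, but I have not worked out how to use it.

```r
# Hypothetical miniature ontology: each term maps to its direct parents.
# "root" stands in for the top of one ontology (e.g. the MF root term).
parents <- list(
  root = character(0),
  A    = "root",
  B    = "root",
  A1   = "A",
  A2   = c("A", "B"),   # a term can have more than one parent
  A11  = "A1"
)

# Collect all ancestors of a term by repeatedly following parent links.
ancestors <- function(term, parents) {
  seen  <- character(0)
  queue <- parents[[term]]
  while (length(queue) > 0) {
    current <- queue[1]
    queue   <- queue[-1]
    if (!(current %in% seen)) {
      seen  <- c(seen, current)
      queue <- c(queue, parents[[current]])
    }
  }
  seen
}

# Terms one step below the root are the "top level" categories.
is_top    <- vapply(parents, function(p) "root" %in% p, logical(1))
top_level <- names(parents)[is_top]

# Top-level categories a term falls under (the term itself if it is
# already top level).
top_categories <- function(term, parents, top_level) {
  intersect(c(term, ancestors(term, parents)), top_level)
}

print(top_categories("A11", parents, top_level))
print(top_categories("A2", parents, top_level))
```

Since the GO graph is not a tree, a term can end up under several
top-level categories (as A2 does here), so a gene would have to be
counted in each of them.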

I hope somebody can point me in the right direction, or to some
documentation with worked examples - if it exists somewhere.

Kind regards,
Palle

-- 
Palle Villesen Fredsted, Assoc. prof., Ph.D.
Bioinformatics Research Center
H. Guldbergs gade 10, build. 1090,
DK-8000 Aarhus C
Contact: +45 8942 3099 / +45 61708600 / www.birc.au.dk


