[BioC] empty go terms with bioMart

chawla chawla at bio.ntnu.no
Mon May 9 12:57:16 CEST 2011


Hi,
I am trying to find GO terms (GO biological process IDs) for a set of
2195 unique "affy_hg_u133a_2" probe IDs.

 > goterms <- getBM(attributes = c("affy_hg_u133a_2",
       "go_biological_process_id", "entrezgene"),
       filters = "affy_hg_u133a_2", values = data[,1], mart = ensembl)
 > head(goterms)
   affy_hg_u133a_2 go_biological_process_id entrezgene
1       209891_at               GO:0051301      57405
2       209891_at               GO:0007052      57405
3       209891_at               GO:0007059      57405
4       209891_at               GO:0007049      57405
5       209891_at               GO:0007067      57405
6       206204_at               GO:0007165       2888

 > dim(goterms)
[1] 15088     3

 > length(unique(goterms[,1]))
[1] 1875

 > length(which(goterms[,2]==""))
[1] 1222

My question: of the 2195 unique probe IDs, 1875 are present in the
result, so why do 1222 rows have "" as the biological process ID?
I would expect entries without a GO biological process term to simply be
absent from the result. Is something wrong?
If not, I will have to filter these rows out each time I use biomaRt for
GO term extraction.
The same problem occurred with yeast and rat data.
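
For reference, a minimal sketch of the filtering step mentioned above. It
does not query BioMart; it assumes a small stand-in data frame with the
same three columns as the getBM() result, and the probe "hypothetical_at"
and the empty rows are invented purely for illustration.

```r
# Toy stand-in for the getBM() result. The first three rows are taken from
# the output shown above; the last two rows are hypothetical.
goterms <- data.frame(
  affy_hg_u133a_2 = c("209891_at", "209891_at", "206204_at",
                      "206204_at", "hypothetical_at"),
  go_biological_process_id = c("GO:0051301", "GO:0007052", "GO:0007165",
                               "", ""),
  entrezgene = c(57405, 57405, 2888, 2888, NA),
  stringsAsFactors = FALSE
)

# Keep only rows that actually carry a GO biological process ID
has_go <- !is.na(goterms$go_biological_process_id) &
  goterms$go_biological_process_id != ""
goterms_annotated <- goterms[has_go, ]

# Probes that appear in the result but have no GO term in any row
probes_without_go <- setdiff(unique(goterms$affy_hg_u133a_2),
                             unique(goterms_annotated$affy_hg_u133a_2))
```

With the real result, comparing length(unique(goterms_annotated[,1]))
against 1875 would show how many probes are present only through empty rows.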
Thanks in advance
Konika


