[BioC] Unambiguous mapping of Affy IDs to gene symbols using hgu133plus2.db

Christian Ruckert cruckert at uni-muenster.de
Fri Oct 1 12:10:50 CEST 2010

I am mapping Affymetrix probeset IDs to gene symbols using the 
hgu133plus2.db package.

As the following example illustrates, each of the 40686 mapped probesets 
maps to exactly one gene symbol.

 > library("hgu133plus2.db")
 > x <- hgu133plus2SYMBOL
 > Llength(x)
[1] 54675
 > count.mappedkeys(x)
[1] 40686

 > head(nhit(x))
1007_s_at   1053_at    117_at    121_at 1255_g_at   1294_at
         1         1         1         1         1         1

 > table(nhit(x))

     0     1
13989 40686

Am I correct that a gene symbol annotation is only included in the 
package if the mapping is unambiguous?

For example
 > x[["203074_at"]]
[1] NA

But NetAffx and BioMart return:

If I map between protein and gene expression arrays based on gene 
symbols, can the results be improved by using BioMart instead of the 
annotation packages?
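
To compare the two sources, I would tabulate the number of distinct 
symbols per probeset for any probe/symbol table, in the same way as 
table(nhit(x)) above. This is only a minimal sketch in base R: the 
function name symbols_per_probe, the column names "probe" and "symbol", 
and the toy IDs are all made up for illustration and do not come from 
either annotation source.

```r
# Count distinct, non-empty symbols per probeset in a two-column table.
# `probe` and `symbol` are hypothetical column names (assumptions).
symbols_per_probe <- function(map, probe = "probe", symbol = "symbol") {
  id  <- as.character(map[[probe]])
  sym <- as.character(map[[symbol]])
  # Factor levels keep probesets that have no symbol at all (count 0).
  ids  <- factor(id, levels = unique(id))
  keep <- !is.na(sym) & nzchar(sym)
  # Drop duplicated probe/symbol pairs so each symbol is counted once.
  pairs <- unique(data.frame(id = ids[keep], sym = sym[keep],
                             stringsAsFactors = FALSE))
  tab <- table(pairs$id)
  structure(as.vector(tab), names = names(tab))
}

# Toy data with invented IDs, only to show the shape of the result.
toy <- data.frame(
  probe  = c("probe_A", "probe_B", "probe_B", "probe_C"),
  symbol = c("GENE1",   "GENE2",   "GENE3",   NA),
  stringsAsFactors = FALSE
)
n <- symbols_per_probe(toy)
print(n)         # symbols per probeset
print(table(n))  # how many probesets have 0, 1, 2, ... symbols
```

Applied to a table exported from BioMart or NetAffx, a count above 1 
would flag the probesets whose symbol assignment is ambiguous in that 
source.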


 > sessionInfo()
R version 2.11.0 (2010-04-22)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] hgu133plus2.db_2.4.1 org.Hs.eg.db_2.4.1   RSQLite_0.9-1
[4] DBI_0.2-5            AnnotationDbi_1.10.1 Biobase_2.8.0

loaded via a namespace (and not attached):
[1] tools_2.11.0

