[BioC] how does an annotation package handle ambiguous probe set id mappings

James W. MacDonald jmacdon at med.umich.edu
Mon Oct 19 19:00:57 CEST 2009


Hi Andrew,

Andrew Yee wrote:
> Apologies if this has been asked before, but how does an annotation
> package handle an ambiguous probe set ID mapping?
> 
> Take for example the Affymetrix chip U133X3P.
> 
> When I use the annotation for this chip for probe set ID
> 1552641_3p_s_at, it returns only one match:
> 
>> library('u133x3p.db')
>> mget('1552641_3p_s_at', env=u133x3pSYMBOL)
> $`1552641_3p_s_at`
> [1] "ATAD3B"
>> mget('1552641_3p_s_at', env=u133x3pENTREZID)
> $`1552641_3p_s_at`
> [1] "83858"
> 
> However, when I search Affymetrix, with:
> 
> https://www.affymetrix.com/analysis/netaffx/fullrecord.affx?pk=U133_X3P:1552641_3P_S_AT
> 
> it states that it ambiguously maps to three gene symbols, ATAD3A,
> ATAD3B, and LOC732419.
> 
> How does the annotation package determine which gene symbol it should map to?

In the past we just used the first probeset ==> Entrez Gene ID mapping
and ignored any others. However, in the soon-to-be-released BioC 2.5
annotation packages all the mappings are included (thanks to Marc
Carlson), and toggleProbes() will expose them:

 > tmp <- toggleProbes(u133x3pENTREZID, "all")
 > get('1552641_3p_s_at', tmp)
[1] "55210"  "732419" "83858"
 > tmp2 <- toggleProbes(u133x3pSYMBOL, "all")
 > get('1552641_3p_s_at', tmp2)
[1] "ATAD3A"    "LOC732419" "ATAD3B"
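
If you want to check a probeset programmatically rather than by eye, something along these lines might work. This is only a sketch: it assumes the BioC 2.5 (or later) u133x3p.db is installed, and the "more than one distinct Entrez Gene ID" test for ambiguity is my own rule of thumb, not something the package defines. The variable names are made up for illustration.

```r
# Sketch only: assumes u133x3p.db from BioC 2.5 or later is installed.
library(u133x3p.db)

probe <- "1552641_3p_s_at"

# Expose every probe set -> gene mapping, not just the default subset.
all_ids  <- toggleProbes(u133x3pENTREZID, "all")
all_syms <- toggleProbes(u133x3pSYMBOL, "all")

# mget() returns a named list; ifnotfound = NA avoids an error for
# probe set IDs that are not keys of the map.
ids  <- mget(probe, all_ids,  ifnotfound = NA)[[1]]
syms <- mget(probe, all_syms, ifnotfound = NA)[[1]]

# Hypothetical rule of thumb: call a probe set ambiguous when it maps
# to more than one distinct Entrez Gene ID.
is_ambiguous <- length(unique(na.omit(ids))) > 1

print(ids)
print(syms)
print(is_ambiguous)
```

Note that the two lookups are independent, so I would not rely on the Entrez Gene IDs and the symbols coming back in matching order.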

Oddly enough, with this version of the package the probeset isn't
mapped at all in the 'regular' (default) mappings:

 > get('1552641_3p_s_at', u133x3pENTREZID)
[1] NA
 > get('1552641_3p_s_at', u133x3pSYMBOL)
[1] NA

Marc?

 > sessionInfo()
R version 2.10.0 Under development (unstable) (2009-09-21 r49780)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] u133x3p.db_2.3.5     org.Hs.eg.db_2.3.4   RSQLite_0.7-2
[4] DBI_0.2-4            AnnotationDbi_1.7.17 Biobase_2.5.6

loaded via a namespace (and not attached):
[1] tools_2.10.0

Best,

Jim


> 
> Thanks,
> Andrew
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


