[BioC] How to go from affymetrix to Ensembl transcript IDs

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Apr 10 00:01:03 CEST 2009

Hi Peter,

On Apr 9, 2009, at 5:40 PM, Peter Robinson wrote:

> Hi all,
> sorry if this is a dumb question, but rtfm has not helped so far.
> I would like to get the Ensembl transcript IDs that correspond to  
> affymetrix probeset ids using biomaRt. As a test case, I am using  
> the ALL data set from bioconductor. My code:
> library("biomaRt")
> library("ALL")
> data("ALL")  ## Note this dataset uses hgu95av2 Affymetrix chip
> dat <- exprs(ALL)
> affyids = rownames(dat)
> ## get mapping data from Ensembl via biomaRt
> ensembl <- useMart("ensembl")
> ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl)
> mapping <- getBM(attributes = c("affy_hg_u95av2",  
> "ensembl_transcript_id"), filters = "affy_hg_u95av2",
>   values = affyids, mart = ensembl)
> Here is where the problem is. The "mapping" seems to be a random  
> collection of transcript IDs.

Your query is correct, so the results are not random. You can
double-check by running the small example in the ?getBM help page.
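
Another quick sanity check is to compare the returned data frame against
the IDs you queried. Here is a minimal sketch that uses a small made-up
data frame in place of the real getBM() result; the probeset and
transcript IDs are placeholders, not real annotation. As far as I know,
getBM() does not promise to return rows in the order of `values`, which
can make the output look shuffled, so the sketch looks up one probeset
explicitly instead of reading rows by position.

```r
# Toy stand-in for the data frame returned by getBM(); all IDs are made up.
mapping <- data.frame(
  affy_hg_u95av2        = c("probe_A", "probe_A", "probe_B", "probe_C"),
  ensembl_transcript_id = c("ENST_x1", "ENST_x2", "ENST_x3", "ENST_x4"),
  stringsAsFactors = FALSE
)
# Toy stand-in for rownames(exprs(ALL)).
affyids <- c("probe_A", "probe_B", "probe_C", "probe_D")

# Every probeset in the result should be one that was asked for.
all_queried <- all(mapping$affy_hg_u95av2 %in% affyids)

# Probesets that came back with no transcript at all.
unmapped <- setdiff(affyids, mapping$affy_hg_u95av2)

# Look up the transcripts for a single probeset by ID, not by row position.
probe_a_tx <- mapping$ensembl_transcript_id[mapping$affy_hg_u95av2 == "probe_A"]
```

With the real result you would pick a handful of probesets this way and
compare them against what the Ensembl or NetAffx web pages show.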

Anyway: that probe looks like a weird one. Even Affy maps it to several
locations. See:


You will need an Affy NetAffx account to see that. The relevant detail
from that page is that the probe maps to 6 different Ensembl IDs.

It even aligns to two different places:


You'll probably find this for many probes, so you'll need a policy for
handling probesets that map to more than one transcript.
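
To make that concrete, here is a rough sketch of two possible policies,
again on a made-up data frame standing in for the getBM() result (the
IDs are placeholders). Which policy is appropriate depends on your
analysis; these are just two common options, not a recommendation.

```r
# Toy stand-in for the data frame returned by getBM(); all IDs are made up.
mapping <- data.frame(
  affy_hg_u95av2        = c("probe_A", "probe_A", "probe_B", "probe_C"),
  ensembl_transcript_id = c("ENST_x1", "ENST_x2", "ENST_x3", "ENST_x4"),
  stringsAsFactors = FALSE
)

# Count the distinct transcripts each probeset maps to.
n_tx <- tapply(mapping$ensembl_transcript_id, mapping$affy_hg_u95av2,
               function(x) length(unique(x)))

# Policy 1: keep only probesets that map to exactly one transcript.
unique_probes <- names(n_tx)[n_tx == 1]
unique_map <- mapping[mapping$affy_hg_u95av2 %in% unique_probes, ]

# Policy 2: keep everything, as a list with one element per probeset.
tx_by_probe <- split(mapping$ensembl_transcript_id, mapping$affy_hg_u95av2)
```

Policy 1 throws away ambiguous probesets entirely, while policy 2 keeps
them and leaves the decision to whatever code consumes the list.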

Hope that helps,

Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

