[BioC] How to go from affymetrix to Ensembl transcript IDs
mailinglist.honeypot at gmail.com
Fri Apr 10 00:01:03 CEST 2009
On Apr 9, 2009, at 5:40 PM, Peter Robinson wrote:
> Hi all,
> sorry if this is a dumb question, but rtfm has not helped so far.
> I would like to get the Ensembl transcript IDs that correspond to
> Affymetrix probeset IDs using biomaRt. As a test case, I am using
> the ALL data set from bioconductor. My code:
> library(ALL)      ## provides the ALL ExpressionSet (loads Biobase too)
> library(biomaRt)
> data("ALL") ## Note this dataset uses hgu95av2 Affymetrix chip
> dat <- exprs(ALL)
> affyids <- rownames(dat)
> ## get mapping data from Ensembl via biomaRt
> ensembl <- useMart("ensembl")
> ensembl <- useDataset("hsapiens_gene_ensembl", mart = ensembl)
> mapping <- getBM(attributes = c("affy_hg_u95av2",
>                                 "ensembl_transcript_id"),
>                  filters = "affy_hg_u95av2",
>                  values = affyids, mart = ensembl)
> Here is where the problem is. The "mapping" seems to be a random
> collection of transcript IDs.
Your query is right, so ... your results are not random. You can
double check by trying the small example in the ?getBM help.
Anyway, that probe looks like a weird one. Even Affy maps it to several
Ensembl IDs. You will need an Affy NetAffx account to see that. Some
relevant stats from that page are that the probe maps to 6 different
Ensembl IDs. It even aligns to two different places.
You'll probably find this for many probes, so you'll need some policy
to deal with that.
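
To make "some policy" a bit more concrete, here is a rough sketch of two
options you could start from. It runs on a toy data frame shaped like the
output of your getBM() call; the probe and transcript IDs in it are made
up for illustration, and the variable names (n_tx, mapping_unique,
mapping_collapsed) are just my own. Which policy is appropriate depends on
your analysis, so treat this as a starting point rather than a recommendation.

```r
## Toy stand-in for the data frame returned by getBM().
## All probe and transcript IDs below are invented for illustration.
mapping <- data.frame(
  affy_hg_u95av2 = c("probeA_at", "probeA_at", "probeB_at",
                     "probeC_at", "probeC_at", "probeC_at"),
  ensembl_transcript_id = c("ENST_x1", "ENST_x2", "ENST_x3",
                            "ENST_x4", "ENST_x5", "ENST_x6"),
  stringsAsFactors = FALSE
)

## How many distinct transcripts does each probeset map to?
n_tx <- tapply(mapping$ensembl_transcript_id, mapping$affy_hg_u95av2,
               function(x) length(unique(x)))

## Policy 1: keep only probesets that map to exactly one transcript.
unique_probes <- names(n_tx)[n_tx == 1]
mapping_unique <- mapping[mapping$affy_hg_u95av2 %in% unique_probes, ]

## Policy 2: keep every probeset, but collapse its transcripts
## into a single semicolon-separated string (one row per probeset).
mapping_collapsed <- aggregate(
  ensembl_transcript_id ~ affy_hg_u95av2,
  data = mapping,
  FUN = function(x) paste(unique(x), collapse = ";")
)
```

Looking at table(n_tx) on your real mapping should also tell you how
common the multi-mapping probes are before you pick a policy.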
Hope that helps,
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University