[BioC] (no subject)

James Perkins jperkins at biochem.ucl.ac.uk
Tue Mar 13 16:13:21 CET 2012


Dear list,

Apologies if this is posted twice; I accidentally sent it earlier to
bioconductor at stat.math.ethz.ch

Has anyone used biomaRt to get the Ensembl IDs for the Affymetrix
rat exon array probes at the transcript cluster level?

For example:

library(oligo)
exonCELs <- list.celfiles("where the rat CEL files are", full.names=TRUE)
affyExonFS <- read.celfiles(exonCELs)
exonCore <- rma(affyExonFS, target = "core")
# featureData(exonCore) <- getNetAffx(exonCore, "transcript")

library(biomaRt)
mart <- useMart("ensembl", dataset="rnorvegicus_gene_ensembl")
genes <- getBM(attributes = c("ensembl_gene_id", "affy_raex_1_0_st_v1"),
               filters = "affy_raex_1_0_st_v1",
               values = featureNames(exonCore),
               mart = mart)

This runs, but I only get Ensembl IDs for a few of the transcript clusters:

> dim(genes)
[1] 370   2
> length(featureNames(exonCore))
[1] 8793

I don't understand this. In particular, I noticed that no transcript
cluster ID of the form 7xxxxxx returns an Ensembl ID.
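
One way to check that pattern is to tabulate the leading digit of each ID against whether getBM returned a hit for it. This is only a rough sketch: the IDs and Ensembl gene IDs below are made-up stand-ins for featureNames(exonCore) and the real getBM result, so the counts mean nothing.

```r
# Toy stand-ins (hypothetical values) for featureNames(exonCore)
# and for the data frame returned by getBM()
ids <- c("7020710", "7031234", "7382211", "6900011", "6900023")
genes <- data.frame(
  ensembl_gene_id     = c("ENSRNOG00000000001", "ENSRNOG00000000002"),
  affy_raex_1_0_st_v1 = c("6900011", "6900023"),
  stringsAsFactors    = FALSE
)

# TRUE for every ID that came back with at least one Ensembl gene
mapped <- ids %in% as.character(genes$affy_raex_1_0_st_v1)

# Cross-tabulate the leading digit of each ID against mapped/unmapped
print(table(leading_digit = substr(ids, 1, 1), mapped = mapped))

# The IDs that returned nothing
unmapped <- ids[!mapped]
```

With the real objects, only the first two assignments would change: ids becomes featureNames(exonCore) and genes is the getBM result.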

I wondered if it might be that affy_raex_1_0_st_v1 summarises at the
probeset or exon level, but in that case surely nothing would be
returned by the getBM call?

Any help would be much appreciated.

I couldn't find an easy way to do the same thing using oligo /
getNetAffx() but I may not have looked hard enough! My next option is
to try xmapcore.

Cheers,

Jim

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] biomaRt_2.10.0          pd.raex.1.0.st.v1_3.4.0 RSQLite_0.11.1
[4] DBI_0.2-5               oligo_1.18.1            oligoClasses_1.16.0
[7] Biobase_2.14.0

loaded via a namespace (and not attached):
 [1] affxparser_1.26.4     affyio_1.22.0         Biostrings_2.22.0
 [4] bit_1.1-8             ff_2.2-5              IRanges_1.12.6
 [7] preprocessCore_1.16.0 RCurl_1.91-1          splines_2.14.1
[10] XML_3.9-4             zlibbioc_1.0.1


