[BioC] affymetrix probe databases

Pau Marc Muñoz Torres paumarc at gmail.com
Mon Sep 15 16:40:03 CEST 2014


Good afternoon everybody,

I am just taking my first steps working with Affymetrix data and I have
some questions.

I started working with CEL files by exporting the data they contain to a
CSV file. Then I tried to relate the Affymetrix codes to UniProt codes.
I did it as follows:

library(affy)
library("xxx.db")
setwd("/home/paumarc/affy/Data/Exp/Cel")
# Read every CEL file in the working directory
data <- ReadAffy()
my_frame <- data.frame(exprs(data))
# One row per ID in the annotation package; multiple values joined with ", "
Annot <- data.frame(
  ACCNUM  = sapply(contents(xxxACCNUM), paste, collapse = ", "),
  SYMBOL  = sapply(contents(xxxSYMBOL), paste, collapse = ", "),
  DESC    = sapply(contents(xxxGENENAME), paste, collapse = ", "),
  UNIPROT = sapply(contents(xxxUNIPROT), paste, collapse = ", "))
# Outer join on row names, keeping unmatched rows from both tables
all <- merge(Annot, my_frame, by.x = 0, by.y = 0, all = TRUE)
write.csv(all, file = "xxx.csv")

where xxx is one of the following databases:

hgu133a.db
hgu133b.db
pd.hg.u133.plus.2
hgu133plus2.db

Unfortunately, the codes contained in the CEL files and in the database do
not match well. Some examples are:

137707    208130    NA    NA    NA    NA    202.5    192    209.3    313.8
137708    208130_s_at    NM_030984    TBXAS1    thromboxane A synthase 1 (platelet)    P24557, Q53F23, B4DJG6, Q16843    NA    NA    NA    NA
137709    208131    NA    NA    NA    NA    187.5    194.5    167.5    224
137710    208131_s_at    NM_000961    PTGIS    prostaglandin I2 (prostacyclin) synthase    Q16647    NA    NA    NA    NA
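
The pattern in these rows can be reproduced with a small base R sketch. It
is only an illustration under stated assumptions: the two tables are toy
stand-ins, the sample column names s1 to s4 are made up, and the values are
copied from the rows shown above. It shows that merge() pairs only rows
whose names are identical; it does not say which of the two tables carries
the right IDs.

```r
# Toy stand-in for the expression table: row names are bare numbers,
# as in the example rows (values copied from those rows; s1..s4 are
# hypothetical sample names).
my_frame <- data.frame(
  s1 = c(202.5, 187.5), s2 = c(192, 194.5),
  s3 = c(209.3, 167.5), s4 = c(313.8, 224),
  row.names = c("208130", "208131")
)

# Toy stand-in for the annotation table: row names carry the "_s_at" suffix.
Annot <- data.frame(
  SYMBOL = c("TBXAS1", "PTGIS"),
  row.names = c("208130_s_at", "208131_s_at")
)

# Row names present in both tables; merge() can only pair up these.
shared <- intersect(rownames(Annot), rownames(my_frame))
length(shared)

# An outer join keeps the unmatched rows from both sides and fills the
# missing columns with NA.
all_rows <- merge(Annot, my_frame, by = "row.names", all = TRUE)
nrow(all_rows)
```

With these toy tables no row name is shared, so the outer join returns four
rows, each of them half filled with NA, as in the CSV output.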

I suppose that 208131_s_at should match 208131 and 208130_s_at should match
208130. Is that supposition correct? Does anybody know which Affymetrix
database the identification codes 208130_s_at and 208131_s_at correspond
to?

Thank you

pau

Pau Marc Muñoz Torres
skype: pau_marc
http://www.linkedin.com/in/paumarc
http://www.researchgate.net/profile/Pau_Marc_Torres3/info/
