[BioC] Getting probes id for particular probeset id

Mark marek.piatek at bbsrc.ac.uk
Wed Feb 10 12:05:44 CET 2010


Hi all,

I’m trying to get the probes for a particular probeset id from my MoGene arrays.
From the experiment description file (dabg.summary.txt) I can see that there are
around 241,500 probeset ids for my 12 arrays. When loading the .CEL files into
Bioconductor I see 1,102,500 values for my 12 arrays. Thus I think there should
be roughly 4 to 5 probes per probeset on average.
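
As a quick sanity check of that estimate, the ratio can be computed directly in
R. The two counts are the ones quoted above; the variable names are only for
illustration, and this assumes the probes are spread evenly across probesets,
which may well not be the case:

```r
# Counts quoted above; the variable names are illustrative only.
n_probes    <- 1102500  # values seen after loading the .CEL files
n_probesets <- 241500   # probeset ids listed in dabg.summary.txt

# Average number of probes per probeset, assuming an even spread.
probes_per_probeset <- n_probes / n_probesets
probes_per_probeset     # about 4.57
```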

However, when I load an experiment description file into an AnnotatedDataFrame
object:

Affy.Expt <- read.AnnotatedDataFrame("dabg.summary.txt", header=TRUE,
row.names=1, sep="\t")

and try to use it as my phenoData when loading the .CEL files into an AffyBatch
object:

Affy.Data <- ReadAffy(filenames=colnames(pData(Affy.Expt)), phenoData=Affy.Expt,
verbose=TRUE)

I get a warning:

Warning message:
In read.affybatch(filenames = l$filenames, phenoData = l$phenoData,  :
  Incompatible phenoData object. Created a new one.

I take this to mean that the number of rows in my experiment description file
(241,500 probeset ids) does not match the number of rows in the .CEL files
(1,102,500 probes). When that happens, it seems to discard the probeset ids and
number the rows from 1 to 1,102,500, thus losing track of the probeset ids.

The point is that I need to know which probes belong to which probeset id and
have their values stored.
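
To make the goal concrete, here is a toy sketch in base R of the structure I am
after. Everything in it is made up for illustration: the probeset ids, probe
indices, array names and intensities are hypothetical and do not come from any
real MoGene annotation, and it does not use any Bioconductor function.

```r
# Toy illustration only: all ids and values below are invented.
probe_map <- data.frame(
  probeset_id = c("ps_A", "ps_A", "ps_A", "ps_B", "ps_B"),
  probe_index = c(101L, 102L, 103L, 204L, 205L),
  stringsAsFactors = FALSE
)

# Hypothetical intensity matrix: one row per probe, one column per array.
intensities <- matrix(
  c(7.1, 6.9, 7.4, 5.2, 5.0,
    7.3, 7.0, 7.2, 5.4, 5.1),
  ncol = 2,
  dimnames = list(as.character(probe_map$probe_index),
                  c("array1", "array2"))
)

# Group the probe rows by probeset id, keeping the per-probe values.
by_probeset <- split(as.data.frame(intensities), probe_map$probeset_id)

# All probes (rows) and their values (columns) for one probeset.
by_probeset[["ps_B"]]
```

What I am missing is the real equivalent of probe_map for my arrays, i.e. the
mapping from each row of the loaded .CEL data to its probeset id.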

I looked at the CDF file, but it looks strange and I can’t get anything useful
from it. I thought that looking into the rma algorithm might help, but it calls
an external function which I don’t understand.
Is there some easy way to get that information?

Thank you in advance,
Mark


