[BioC] localization of mm values in affybatch exprs matrix

lgautier at altern.org lgautier at altern.org
Sun Jan 28 04:28:16 CET 2007


As Kasper points out, using the information stored in the CDF
looks very much like the only safe solution,
rather than relying on the observed fact that a MM is beside its
corresponding PM, since I do not think the manufacturer of the
chips otherwise claims it to be this way.
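
To make the point concrete, here is a minimal base-R sketch with made-up
numbers. The names "toy_exprs" and "toy_cdf" are hypothetical, and the list
of two-column pm/mm matrices only mimics the way a CDF environment pairs
indices per probeset; nothing here is read from a real chip.

```r
# Toy intensity matrix: one probe per row, one chip per column (made-up values)
toy_exprs <- matrix(c(942, 24422, 1024, 26267, 130, 2290,
                      776, 26071,  908, 27674, 193, 3189),
                    ncol = 2,
                    dimnames = list(NULL, c("chip1.CEL", "chip2.CEL")))

# Hypothetical CDF-like lookup: for each probeset, a matrix whose rows are
# probe pairs and whose columns give the row index of the PM and of the MM.
# The pairs are deliberately NOT laid out as "PM block, then MM block".
toy_cdf <- list(
  probesetA = cbind(pm = c(2, 4), mm = c(1, 3)),
  probesetB = cbind(pm = 6,       mm = 5)
)

# Collect the PM and MM row indices across all probesets
pm_idx <- unlist(lapply(toy_cdf, function(m) m[, "pm"]))
mm_idx <- unlist(lapply(toy_cdf, function(m) m[, "mm"]))

# Subset the intensity matrix using the lookup, not the row order
pm_values <- toy_exprs[pm_idx, , drop = FALSE]
mm_values <- toy_exprs[mm_idx, , drop = FALSE]
```

The only thing the code trusts is the lookup table, so it keeps working
whatever the physical arrangement of PM and MM on the chip turns out to be.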

Probe-level data in an AffyBatch are stored in a matrix, having
one probe per row and one chip per column.
The method "indexProbes" for "AffyBatch" returns row indices
into that matrix.
If you want the X/Y coordinates for an index, a convenient way
is to use the function "indices2xy".

Example:

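# your AffyBatch being "abatch"
library(affy)

# indexProbes() returns a list with one element per probeset
imm <- indexProbes(abatch, which="mm")
# indices2xy() works on a vector of indices, hence unlist()
xymm <- indices2xy(unlist(imm), abatch=abatch)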
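
For intuition about what such a conversion does, here is a toy sketch in
base R. The helpers "toy_index2xy" and "toy_xy2index" are hypothetical, and
they ASSUME a row-major layout with zero-based x and y; the convention
actually used should be checked with indices2xy() on your own AffyBatch.

```r
# Toy conversion between a row index of the probe-level matrix and X/Y
# coordinates. ASSUMPTION: row-major layout, zero-based x and y.
toy_index2xy <- function(i, nr) {
  cbind(x = (i - 1) %% nr, y = (i - 1) %/% nr)
}

# Inverse mapping under the same assumption
toy_xy2index <- function(x, y, nr) {
  x + nr * y + 1
}

nr   <- 754                       # side length reported for Karin's 754x754 chip
idx  <- c(1, 2, 754, 755, 1510)   # a few arbitrary row indices
xy   <- toy_index2xy(idx, nr)
back <- toy_xy2index(xy[, "x"], xy[, "y"], nr)
```

The round trip gives back the original indices, which is a cheap sanity
check worth running with the real functions as well.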





Hoping this helps,



Laurent




> So I was a bit quick. It seems from Karin's post that she already has
> a CDF env.
>
> Jim's statement that the CEL file is ordered PM then MM is probably
> right for most chips, but in general you can only be sure that the PM
> and the MM are spatially close. In general you should use the CDF
> information to link the pm/mm/(x,y) positions together, and you cannot
> know a priori what a coordinate corresponds to in terms of
> pm/mm/probeset.
>
> Kasper
>
> On Jan 26, 2007, at 2:56 PM, Kasper Daniel Hansen wrote:
>
>>
>> On Jan 25, 2007, at 10:10 AM, James W. MacDonald wrote:
>>
>>> Hi Karin,
>>>
>>> Karin Lagesen wrote:
>>>> I have a custom affy chip that I read into R using ReadAffy():
>>>>
>>>>> newdata = ReadAffy()
>>>>> newdata
>>>>
>>>> AffyBatch object
>>>> size of arrays=754x754 features (17777 kb)
>>>> cdf=E_colia530222N (11378 affyids)
>>>> number of samples=4
>>>> number of genes=11378
>>>> annotation=ecolia530222n
>>>>
>>>>
>>>> I now want to look at different values in this object.
>>>>
>>>> For instance, some pm values:
>>>>
>>>>
>>>>> pm(newdata)[1:5,]
>>>>
>>>>      chip1.CEL chip2.CEL chip3.CEL chip4.CEL
>>>> [1,]    1855.0    2180.8    1444.0  2932.0
>>>> [2,]    2812.0    3451.0    2276.5  3406.0
>>>> [3,]    4162.3    4301.0    2996.0  5088.0
>>>> [4,]    1608.5    1758.0    1123.0  1987.0
>>>> [5,]    2290.0    3189.0    2474.5  2838.3
>>>>
>>>>
>>>> I now also look at the values in the affybatch exprs matrix:
>>>>
>>>>
>>>>> newdata@exprs[1:5,]
>>>>
>>>>      chip1.CEL chip2.CEL chip3.CEL chip4.CEL
>>>> [1,]     942.0     776.0       281    1475
>>>> [2,]   24422.0   26071.0      8914   21826
>>>> [3,]    1024.5     908.8       227    1594
>>>> [4,]   26267.0   27674.0     16199   22104
>>>> [5,]     130.0     193.0       168     145
>>>>
>>>>
>>>> I also notice that the dimension of the exprs matrix is such that
>>>> there is one column for each chip, and as many rows as there are pm
>>>> plus mm values.
>>>>
>>>> Are the first half of rows the pm values, with the mm values
>>>> following, or are the pm values every other row with the
>>>> corresponding mm value below, or is this set up in some other way?
>>>> Is there any way for me to look at a value in the exprs matrix and
>>>> find out which entry in the pm/mm value list it is?
>>>
>>> The chip is read in row-wise, and the PM probes are in a given row,
>>> with the MM probes in the following row. Therefore, the data
>>> (excluding the various QC probes) will be N PM probes followed by
>>> N MM probes, where N is the row length of the chip.
>>
>> This is not true, I believe. There is no clear order of the PMs and
>> MMs. You need to get that information from somewhere else, usually
>> from a CDF file.
>>
>> Karin: you will need to use the makecdfenv package to make what is
>> called a CDF package - an R representation of the PM/MM/probeset
>> pairs.
>>
>> Kasper
>>
>>
>>> If you really want to work with the exprs matrix directly (why?), you
>>> can use indexProbes() to find the indices for whatever probeset you
>>> are interested in, and then subset out. Alternatively you can get the
>>> indices for the PM and MM probes and subset those out separately
>>> (which is how pm() and mm() work). You can also use pm() or mm() with
>>> an optional genenames argument to get the PM or MM probe values for a
>>> particular probeset or probesets.
>>>
>>>
>>> Best,
>>>
>>> Jim
>>>
>>>
>>>>
>>>> TIA,
>>>>
>>>> Karin
>>>
>>>
>>> --
>>> James W. MacDonald, M.S.
>>> Biostatistician
>>> Affymetrix and cDNA Microarray Core
>>> University of Michigan Cancer Center
>>> 1500 E. Medical Center Drive
>>> 7410 CCGC
>>> Ann Arbor MI 48109
>>> 734-647-5623
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>