[BioC] affy, probeName and sequence information ?

w.huber at dkfz-heidelberg.de
Mon Sep 15 11:40:49 MEST 2003


Hi Laurent

> ... how I can link the "probename" (1007_s_at1) to the corresponding
> sequence? The sequence of 1007_s_at1 is "CACC..." or "GCCC.." or
> etc...? Is there an available relation between the "number" of the probe
> in the Affy package and the position?

The probe sequences and positions (e.g. in the probe packages, or in the
data tables from Affymetrix) are linked to the pm and mm intensities by
the xy2i function, which converts a probe's x and y coordinates into its
row index in the intensity matrix:

> print.data.frame(hgu95av2probe[1000,])
                      sequence   x   y Probe.Set.Name
1000 GGTCTACGTCCGAGAGTGAGTGGCC 387 565        1057_at
     Probe.Interrogation.Position Target.Strandedness
1000                          411           Antisense

> xy2i(387, 565)
[1] 361988

> exprs(Dilution)[xy2i(387, 565), ]
...will give you the intensities of that probe, one value per array.

The reverse mapping is obtained by 'i2xy'. The two functions also take
vector arguments.
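
To make the arithmetic behind this mapping concrete, here is a minimal sketch
in plain R. It is not the code shipped in the CDF packages: the helper names
'xy_to_index' and 'index_to_xy' are hypothetical, and the chip width of 640
columns is an assumption, chosen because it reproduces the xy2i(387, 565)
result shown above with zero-based x and y.

```r
# Hypothetical stand-ins for xy2i / i2xy; not the CDF package code.
# Assumption: the chip has 640 columns of probe cells.
ncol_chip <- 640

# x and y are zero-based cell coordinates; the result is a one-based
# row index into the intensity matrix.
xy_to_index <- function(x, y, nc = ncol_chip) {
  y * nc + x + 1
}

# Inverse mapping: recover zero-based (x, y) from the one-based index.
index_to_xy <- function(i, nc = ncol_chip) {
  cbind(x = (i - 1) %% nc, y = (i - 1) %/% nc)
}

idx <- xy_to_index(387, 565)
idx                # 361988, the same value as xy2i(387, 565) above
index_to_xy(idx)   # x = 387, y = 565

# Both helpers accept vectors, like the real functions.
xy_to_index(c(0, 387), c(0, 565))
```

With the real packages you would of course call 'xy2i' and 'i2xy' directly;
the sketch only shows why a pair of coordinates and a single row index carry
the same information.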

There are three caveats:

1. Presently, the functions 'pm' and 'mm' from the affy package are not
well integrated with this procedure. Suggestions are welcome.

2. There is still a minor bug in the functions 'xy2i' and 'i2xy' that
come with the CDF packages on the webpage. It has already been fixed in
the package 'makecdfenv', which produces the CDF packages, and the
packages will be rebuilt soon. The bug only affects probe cells at the
rightmost edge of the chip, so it should not be critical. See the thread
in the mailing list:
https://www.stat.math.ethz.ch/pipermail/bioconductor/2003-August/002229.html

3. There are related functions 'xy2indices' and 'indices2xy', which work
just as well. However, they use a different numbering convention for the x
and y coordinates (you have to add 1 to x and to y).
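
As an illustration of the third caveat, the following sketch contrasts the
two numbering conventions. It assumes that the only difference is a shift of
both coordinates by one; the function names are hypothetical, the chip width
of 640 columns is an assumption, and none of this is the actual
implementation of 'xy2indices'.

```r
nc <- 640  # assumed number of columns on the chip

# Zero-based coordinates; reproduces xy2i(387, 565) = 361988 from the example.
index_zero_based <- function(x, y) y * nc + x + 1

# One-based coordinates: x and y start at 1 instead of 0.
index_one_based <- function(x, y) (y - 1) * nc + x

index_zero_based(387, 565)          # 361988
index_one_based(387 + 1, 565 + 1)   # 361988, after adding 1 to x and y
```

Under this assumption, mixing the two conventions without the shift would
point at a neighbouring probe cell rather than at the intended one.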

Best regards
  Wolfgang


