[BioC] Methods to access GeneName to ProbeID in package Affy

Mark Dalphin mdalphin at amgen.com
Tue Dec 16 01:40:06 MET 2003


Thank you, Wolfgang, for your reply.

I fear that I didn't understand that section of the vignette, so it didn't
register as the section to return to when I got stuck.

When I tried your second method with the package hgu95aprobe, I could not see
the connection back to the ProbeID:

> str(hgu95aprobe)
Classes probetable  and `data.frame':   199091 obs. of  6 variables:
 $ sequence                    :Class 'AsIs'  chr [1:199091] "TCTCCTTTGCTGAGGCCTCCAGCTT" "AGGCCTCCAGCTTCAGGCAGGCCAA" "CCAGCTTCAGGCAGGCCAAGGCCTT" "AGCTCAGGTGGCCCCAGTTCAATCT" ...
 $ x                           : int  399 544 530 617 459 408 484 548 578 498 ...
 $ y                           : int  559 185 505 349 489 545 311 333 369 465 ...
 $ Probe.Set.Name              :Class 'AsIs'  chr [1:199091] "1000_at" "1000_at" "1000_at" "1000_at" ...
 $ Probe.Interrogation.Position: int  1367 1379 1385 1445 1523 1595 1649 1655 1667 1673 ...
 $ Target.Strandedness         : Factor w/ 2 levels "Antisense","Sense": 1 1 1 1 1 1 1 1 1 1 ...

To prepare a table of ProbeID versus AffyID either for the whole collection or 
a subset, would I rely on the x and y to give me the coordinates of the 
feature on the chip and then use the xy2i() function to compute an index, 
'i', to extract a row from the AffyBatch object?
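
To make the question concrete, here is a minimal sketch in base R of the lookup
I have in mind. Everything in it is a toy assumption: the probe table and the
intensity matrix are invented stand-ins, and xy_to_index() is a hypothetical
helper that assumes 0-based (x, y) coordinates stored row by row. It is not the
definition of xy2i(), whose offset convention I have not checked.

```r
# Toy stand-in for a probe table such as hgu95aprobe (values invented).
probes <- data.frame(
  x = c(0L, 2L, 1L, 3L),
  y = c(0L, 0L, 1L, 2L),
  Probe.Set.Name = c("1000_at", "1000_at", "1001_at", "1001_at"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT the affy xy2i(): assumes 0-based (x, y) and
# features numbered row by row on a chip `ncol_chip` features wide.
xy_to_index <- function(x, y, ncol_chip) {
  y * ncol_chip + x + 1L
}

ncol_chip <- 4L
probes$index <- xy_to_index(probes$x, probes$y, ncol_chip)

# Toy intensity matrix: one row per feature, one column per chip.
intensity <- matrix(seq_len(16 * 2), nrow = 16, ncol = 2,
                    dimnames = list(NULL, c("chip1", "chip2")))

# Desired table: AffyID, ProbeID (here the feature index), then one
# intensity column per chip, picked out by the computed row index.
result <- data.frame(
  AffyID = probes$Probe.Set.Name,
  ProbeID = probes$index,
  intensity[probes$index, , drop = FALSE]
)
print(result)
```

If the real coordinates are 1-based, or the features are numbered column by
column, only the body of xy_to_index() would need to change.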

Thanks,
Mark


=====================================================================================
On Monday 15 December 2003 02:56 pm, w.huber at dkfz-heidelberg.de wrote:
> this is documented in the vignette to the affy package, section 7.
> To repeat it here, you can get the probe set names by
>
>    psets = ls(hgu95av2cdf)
>
> (replace "hgu95av" by whatever your chip name is). The indices of
> the j-th probe set's PM and MM probes are given by
>
>   get(psets[j], hgu95av2cdf)
>
> and use these to subset the exprs matrix of the AffyBatch object.
>
> Alternatively, a data frame that maps probe IDs to probe set IDs (and
> provides the sequence as well) is available in the probe packages, e.g.
> hgu95av2probe.
>
> Best wishes
>  Wolfgang
>
> -------------------------------------
> Wolfgang Huber
> Division of Molecular Genome Analysis
> German Cancer Research Center
> Heidelberg, Germany
> Phone: +49 6221 424709
> Fax:   +49 6221 42524709
> Http:  www.dkfz.de/abt0840/whuber
> -------------------------------------
>
> On Mon, 15 Dec 2003, Mark Dalphin wrote:
> > Hi,
> >
> > I'm trying to extract information from an AffyBatch object. I want to
> > prepare a table which contains:
> >
> > AffyID	ProbeID	PM-1	PM-2	PM-3 ...
> >
> > Where:
> > AffyID is also called the GeneName.
> > ProbeID is the ID for the specific reporter on the chip.
> > PM-1, PM-2, ... are the PerfectMatch intensities for several different
> > chips which are part of the expression set.
> >
> > I can see using the 'pm' method to extract most of this where the ProbeID
> > and PM-1 will be row- and col-names in a matrix (this is great), but then
> > I don't see how to associate the Affy-ID with ProbeID anywhere. A
> > two-column data frame would be just fine.
> >
> > Any help would be appreciated.
> >
> > Thanks,
> > Mark
> >
> > PS I am fairly familiar with R, but many of the data structures in
> > BioConductor seem opaque to me. I believe this is due to the new S4 class
> > structure being used, but I am not certain. Any pointers to how these
> > data are represented and how to browse them would be appreciated; my old
> > stand-by of str() is failing on these data:
> >
> > > str(spikein)
> >
> > List of 59
> >  $ : int [1:59] 1 2 3 4 5 6 7 8 9 10 ...
> >  $ :Error in .subset2(x, i) : subscript out of bounds
> >
> > R version 1.8.0 under RedHat Linux 7.3

-- 
Mark Dalphin                          email: mdalphin at amgen.com
Mail Stop: 29-2-A                     phone: +1-805-447-4951 (work)
One Amgen Center Drive                       +1-805-375-0680 (home)
Thousand Oaks, CA 91320                 fax: +1-805-499-9955 (work)
