[BioC] Question on afy/gcrma probe indexes

Park, Richard Richard.Park at joslin.harvard.edu
Wed Apr 28 19:55:27 CEST 2004


I have tried to access the x and y coordinates using the xy2i() and i2xy() functions, and I would be very cautious about the values they return. I tried creating a fake .cel file using these functions, and the result was never fully correct.

I eventually had to download a library file from the Affymetrix site that had a full list of the x and y values for each probe set. I am unsure where these files are now, since the Affymetrix site has been significantly revised. On average, those functions probably gave me correct x and y positions only 30-40 percent of the time. The only way I was able to get a functional fake .cel file was to use the x and y positions supplied by Affymetrix.

Richard Park

-----Original Message-----
From: Wolfgang Huber [mailto:w.huber at dkfz-heidelberg.de]
Sent: Wednesday, April 28, 2004 7:22 AM
To: rphaney at bigfoot.com
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Question on afy/gcrma probe indexes

Hi Rich,

Affymetrix counts x- and y-coordinates starting at 0, and so do the
probe packages and the functions xy2i and i2xy from the CDF packages.
For historical reasons, some code in the affy package still uses
coordinates that are incremented by 1 (that is, 1-based).

In AffyBatch objects, x- and y-coordinates are not stored at all: the 
data is stored in a matrix, where columns correspond to different arrays 
and rows to all probes within one array. x and y coordinates can be 
reconstructed from the row index, e.g. by the function i2xy.
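
To make the two conventions concrete, here is a minimal sketch in plain R. It is not the code from the CDF packages: xy2i_sketch and i2xy_sketch are hypothetical stand-ins, and the mapping i = x + 1 + nrows * y (0-based x and y, 1-based row index, nrows = 712) is an assumption, chosen because it reproduces the index 129340 quoted below for (x,y) = (467,181).

```r
# Sketch only: hypothetical stand-ins for the CDF-package helpers.
# Assumptions: 0-based (x, y), 1-based row index, nrows = 712.
nrows <- 712L

# 0-based coordinates -> 1-based row index
xy2i_sketch <- function(x, y, nr = nrows) x + 1L + nr * y

# 1-based row index -> 0-based coordinates
i2xy_sketch <- function(i, nr = nrows) {
  cbind(x = (i - 1L) %% nr, y = (i - 1L) %/% nr)
}

# 0-based coordinates reported for probe 1007_s_at1
xy2i_sketch(467L, 181L)        # 129340

# The 1-based formula applied to the same 0-based values
# comes out nrows + 1 = 713 too low
467L + nrows * (181L - 1L)     # 128627

# Round trip back to 0-based coordinates
i2xy_sketch(129340L)           # x = 467, y = 181
```

Under these assumptions, the gap of 713 between the two numbers comes from feeding 0-based coordinates into a formula that expects both x and y to be 1-based.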

Otherwise, can you please be more specific? Which commands do you use to 
get (1.), which to get (2.), and what do you mean by "in affy" (which 
function, or which object)?

Hope that helps,

Rich Haney wrote:
> I am using gcrma with the HG-U133A dataset.  When I ask for the location
> (index) of the first probe I get:
> (1.) Probe = 1007_s_at1
>      Index = 129340    [ The probe is at (x,y) =(467,181) ]
> As I understand it, the probe position is found using the affy routine
> 'xy2i'.  There, the logic for finding a position from x and y is 0-based for
> y and 1-based for x.  So:
> (2.) Index = x + nrows * (y - 1), with nrows = 712 and, as above, x = 467
>      and y = 181:
>      Index = 467 + 712 * (181 - 1)
>            = 128627 (that is, 712 + 1 less than the answer given above,
>              129340).
> So the question is: in affy, is the Index of probes stored with 1-based
> (not 0-based) y-coordinates, while xy2i assumes 0-based coordinates?
> Thanks for your help!
> ----------------------------------------------------------------------------
> Notes:
> (a.) I believe that this is why my background adjustment is then not
> correct:
> bg.adjust.optical <- function(abatch, minimum=1, verbose=TRUE){
>   # row indices of all PM and MM probes
>   Index <- unlist(indexProbes(abatch, "both"))
>   if(verbose) cat("Adjusting for optical effect")
>   for(i in seq_len(length(abatch))){  # one pass per array
>     if(verbose) cat(".")
>     # shift each array so its smallest probe intensity equals 'minimum'
>     exprs(abatch)[Index,i] <- exprs(abatch)[Index,i] -
>       min(exprs(abatch)[Index,i], na.rm=TRUE) + minimum
>   }
>   abatch  # return the adjusted AffyBatch
> }
> (b.) The probe index is created using the following lines of gcrma:
> ##put it in an affybatch
> tmp <- get("xy2i",paste("package:",cdfpackagename,sep=""))
> affinity.info <- new("AffyBatch",cdfName=cdfname)
> pmIndex <-  unlist(indexProbes(affinity.info,"pm"))
> mmIndex <-  unlist(indexProbes(affinity.info,"mm"))
> subIndex <- match(tmp(p$x,p$y),pmIndex)
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax:   +49 6221 42524709
Http:  www.dkfz.de/abt0840/whuber
