[BioC] Converting makeProbePackage in AnnotationDbi to permit PM-only arrays

Ariel Grostern ariel.grostern at berkeley.edu
Thu Jun 23 23:36:44 CEST 2011


Hi,
I am attempting to adapt the makeProbePackage function in AnnotationDbi to 
permit me to make a probe package for my PM-only Affy array, so that I can 
then use altcdfenvs to re-annotate the probes. My probe sequence file is set 
up exactly as the normal tab-delimited PM/MM file.

From examining the makeProbePackage.R file, it appears that there are only 
a few references to MM probes, so I am removing them (see below). 
However, there is one section that I don't know what to do with:

  ## On most chips, PM and MM probe are next to each other on the chip, at same
  ## x coordinate and at adjacent y coordinates. Then, "sizex" is always the same,
  ## namely the size of the chip in x-direction. On some chips, there are few
  ## exceptions.
tab = table(mm1-pm1)
## WHAT TO DO FOR THE ABOVE LINE?
  sizex = as.numeric(names(tab))[ max(tab)==tab ]

Can I replace the "tab" line with:
tab=table(pm1)
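
To make the arithmetic concrete, here is a toy sketch of what the original line computes and what the proposed replacement would return. The 6-column chip and the probe coordinates are made up for illustration and are not taken from a real CDF; the index formula is the one used in the pm2/mm2 lines of the function below.

```r
## Toy chip, 6 columns wide (hypothetical; real arrays are far larger).
sizex <- 6

## Zero-based (x, y) of three PM probes; on a PM/MM chip each MM sits at (x, y + 1).
x <- c(0, 3, 5)
y <- c(0, 2, 4)

## One-based linear probe indices, as in the pm2/mm2 lines of .lgExtraParanoia.
pm1 <- y * sizex + x + 1
mm1 <- (y + 1) * sizex + x + 1

## Original approach: the most frequent MM - PM difference is the chip width.
tab <- table(mm1 - pm1)
sizex_guess <- as.numeric(names(tab))[max(tab) == tab]

## Proposed replacement: every PM index occurs once, so all counts tie at 1
## and the same expression returns every index rather than a single width.
tab_pm <- table(pm1)
pm_only_guess <- as.numeric(names(tab_pm))[max(tab_pm) == tab_pm]
```

In this toy case sizex_guess is a single number while pm_only_guess has one entry per probe, so if the same holds on a real array, table(pm1) alone would not yield a single value for sizex.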

Thanks for any input,

Ariel

## ----------------------------------------------------------------------
## The table pt contains a probe to probe-set mapping (many-to-one).
## The CDF environment contains a probe-set to probe mapping (one-to-many).
## Here, we check whether they agree.
## In addition, it uses the information in the CDF to guess
## sizex, the size of the chip in x-direction.
## This is done using the fact that with current-day Affymetrix
## for each PM probe at (x,y) there is a MM probe at (x,y+1).
## (C) Laurent Gautier, Wolfgang Huber 2003
## ----------------------------------------------------------------------
.lgExtraParanoia = function (pt, cdfname) {
  do.call(library, list(cdfname))

  thecdf <- as.environment(paste("package", cdfname, sep=":"))[[cdfname]]

  ## Unroll CDF in order to invert the mapping from probe-set -> probe
  ## to probe -> probe-set. psnm1[i] is the probe set name for the i-th probe
  probesetnames = ls(thecdf)
  pm1   = unlist(lapply(probesetnames,
    function(ps) {thecdf[[ps]][,1]}))
## Commented out the mm1 assignment

##mm1   = unlist(lapply(probesetnames,
  ##  function(ps) {thecdf[[ps]][,2]}))
  psnm1 = unlist(lapply(probesetnames,
    function(ps) {rep(ps, nrow(thecdf[[ps]]))}))

  ## On most chips, PM and MM probe are next to each other on the chip, at same
  ## x coordinate and at adjacent y coordinates. Then, "sizex" is always the same,
  ## namely the size of the chip in x-direction. On some chips, there are few
  ## exceptions.
##tab = table(mm1-pm1)

tab = table(pm1)

## WHAT TO DO FOR THE ABOVE LINE?
  sizex = as.numeric(names(tab))[ max(tab)==tab ]

  ## The probe indices according to pt
  pm2   =  pt$y    * sizex + pt$x + 1
## Commented out the mm2 assignment

## mm2   = (pt$y+1) * sizex + pt$x + 1
  psnm2 = pt[["Probe.Set.Name"]]

  ## Check if the probe set names that are associated with each probe
  ## are the same in both CDF and pt
 ## z1 = z2 = rep(NA, max(pm1, mm1, pm2, mm2))
 ## z1[pm1] = z1[mm1] = psnm1
 ## z2[pm2] = z2[mm2] = psnm2

## Removed the mm1 and mm2 references

z1 = z2 = rep(NA, max(pm1, pm2))
z1[pm1] = psnm1
z2[pm2] = psnm2


