Warnes, Gregory R gregory_r_warnes at groton.pfizer.com
Thu Dec 11 16:28:12 MET 2003

> From: rossini at blindglobe.net [mailto:rossini at blindglobe.net]
> "Michael Benjamin" <msb1129 at bellsouth.net> writes:
> > I'd like to analyze these chips in a reasonable amount of time,
> > without paying Dell $45,000 for 4-Xeon SMP server.
> >
> > I worry what we'll do with 1,000 .CEL files.  The analytical
> > techniques work well, but pretty slow even if your amp "goes to 11."
> >
> > Any thoughts?
> Explicitly parallelize the routine.  OpenMOSIX is nice, but it's still
> not a production environment with R.  

I've done some work on parallelizing this kind of analysis here at Pfizer.
So far I've concentrated on the step of applying a statistical model to all
of the genes, and have code that parallelizes it using RPVM + SNOW + a
custom parallel 'apply' function.  The speedup I get for this step looks
perfectly linear.
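
A minimal sketch of a parallel per-gene fit of this kind.  It is not the
Pfizer code: it assumes the 'parallel' package that ships with current R
(which took over much of the snow interface) with local worker processes
instead of PVM, and it uses the stock parRapply() in place of a custom
parallel 'apply'.  The model (a two-group t-test), the name fit_gene, and the
simulated data are all hypothetical stand-ins.

```r
library(parallel)

# Hypothetical per-gene model: two-group t statistic for one row of
# expression values.
fit_gene <- function(y, group) {
  unname(t.test(y ~ group)$statistic)
}

set.seed(1)
expr  <- matrix(rnorm(200 * 8), nrow = 200)    # simulated: 200 genes x 8 chips
group <- factor(rep(c("a", "b"), each = 4))    # simulated two-group design

cl <- makeCluster(2)                           # two local worker processes
# parRapply() splits the rows of 'expr' across the workers and applies
# fit_gene() to each row, like apply(expr, 1, ...).
t_par <- parRapply(cl, expr, fit_gene, group = group)
stopCluster(cl)

# Sequential reference for comparison.
t_seq <- apply(expr, 1, fit_gene, group = group)
```

Because each gene is fitted independently, the rows can be handed out to the
workers without any communication between them, which is why a near-linear
speedup is plausible for this step.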

As for reading in and normalizing the chips, I would suggest using RPVM +
SNOW to spread out the reading-in of the .CEL files (which in my experience
is the most time-consuming step), then combining the results into a single
object, which you can then normalize and scale.  The normalizing and scaling
can, of course, also be split up across processors.
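
The read-then-combine idea can be sketched roughly as follows.  This is only
an illustration under assumptions: it uses the 'parallel' package that ships
with current R rather than RPVM + SNOW, the helper par_read is hypothetical,
and small text files stand in for real .CEL files.  With affy, the reader
would presumably be something like function(f) ReadAffy(filenames = f), and
the combine step would depend on what the installed affy version offers for
joining AffyBatch objects.

```r
library(parallel)

# Hypothetical helper: split 'files' into one chunk per worker, read each
# chunk on its own worker, then combine the per-chunk results in order.
par_read <- function(cl, files, read_chunk, combine) {
  chunks <- lapply(splitIndices(length(files), length(cl)),
                   function(i) files[i])
  Reduce(combine, parLapply(cl, chunks, read_chunk))
}

# Stand-in data: six small text files instead of real .CEL files.
files <- file.path(tempdir(), sprintf("chip%02d.txt", 1:6))
for (i in seq_along(files)) writeLines(as.character(i * 1:5), files[i])

# Stand-in reader: one column of values per file.
read_chunk <- function(fs) sapply(fs, function(f) scan(f, quiet = TRUE))

cl <- makeCluster(2)                           # two local worker processes
intensities <- par_read(cl, files, read_chunk, cbind)
stopCluster(cl)
```

The combined object ends up in the master process, so normalization and
scaling can then be run on it as usual, or split up again in the same way.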

At one point I had preliminary code to do this, but that was a year ago and
the affy code has changed quite a bit since then.


