[BioC] gcrma

Wed Nov 26 14:02:00 MET 2003

Dear Dr. Irizarry

At the CHI meeting in Baltimore I enjoyed your interesting talk about gcrma,
which currently seems to be the best algorithm for condensation/normalization
of CEL files, as the affycomp results suggest. For this reason my colleagues
and I were eager to test gcrma with our own datasets containing between 20
and 130 HGU133 chips.

My colleague tested rma and gcrma on the following setup:
HP xw8000 Dual Xeon 2.8 GHz with 2 GB RAM
RedHat 8.0 with kernel 2.4.18-SMP
R-1.8.1 and Bioconductor 1.3, both compiled by us for this machine

Using this setup, rma processes the data very quickly:
21 HGU133A chips: about 1 minute
40 HGU133A chips: about 1.5 minutes
130 HGU133A chips: about 2.5 minutes

For gcrma we got the following results:
21 HGU133A chips: about 45 minutes using 500 MB RAM
40 HGU133A chips: about 90 minutes using 900 MB RAM
130 HGU133A chips: the usual error: cannot allocate vector of size blabla
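
As a rough check on these numbers, the sketch below extrapolates the two
successful gcrma runs to 130 chips. It assumes that memory use and run time
grow linearly with the number of chips, which two data points cannot confirm;
the helper name extrapolate is only illustrative.

```python
def extrapolate(p1, p2, x):
    """Linearly extrapolate through two (chips, value) points to x chips."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (x - x1)


# Measured gcrma runs from the list above: (number of chips, value).
mem_130 = extrapolate((21, 500), (40, 900), 130)   # memory in MB
time_130 = extrapolate((21, 45), (40, 90), 130)    # run time in minutes

print(f"Estimated memory for 130 chips: {mem_130:.0f} MB")
print(f"Estimated run time for 130 chips: {time_130:.0f} minutes")
```

Under that assumption the 130-chip run would need roughly 2.8 GB of RAM and
about five hours, which is more than the 2 GB installed in this machine.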

Since we will soon switch to the new Affymetrix HG-U133_Plus_2
GeneChips, things will get worse.

My questions are the following:
Do you intend to optimize the performance of gcrma, e.g. by rewriting it in C?
In the meantime, which setup would be sufficient for gcrma to handle
130 HGU133A chips? Do you think that a 64-bit machine would
be helpful? Could the Dual G5 Mac be an option?

Thank you in advance for your help.
P.S. Please also reply to me directly, since I am not subscribed to the mailing list.

Best regards
Christian Stratowa

==============================================
Christian Stratowa, PhD
Boehringer Ingelheim Austria
Dept NCE Lead Discovery - Bioinformatics
Dr. Boehringergasse 5-11
A-1121 Vienna, Austria
Tel.: ++43-1-80105-2470
Fax: ++43-1-80105-2366
email: christian.stratowa at vie.boehringer-ingelheim.com