[BioC] Memory issue in 500k chip analysis

Simon Lin simonlin at duke.edu
Thu Jun 7 17:20:09 CEST 2007

We are experiencing a similar memory problem with Illumina SNP chips. We are 
looking for a general solution using databases to handle very large data 
sets. Does that make sense? -Simon
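
As a rough illustration of the database idea, here is a minimal sketch in
Python using the standard-library sqlite3 module. The table name "intensity",
its columns, and the helper names store_intensities and iter_chunks are
hypothetical; nothing here is part of oligo or any Bioconductor package, and
the values are made up.

```python
import sqlite3


def store_intensities(conn, rows):
    """Insert (snp_id, sample_id, value) tuples into a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS intensity ("
        "snp_id TEXT, sample_id TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO intensity VALUES (?, ?, ?)", rows)
    conn.commit()


def iter_chunks(conn, chunk_size):
    """Yield lists of rows so that at most chunk_size rows are fetched at once."""
    cursor = conn.execute(
        "SELECT snp_id, sample_id, value FROM intensity "
        "ORDER BY snp_id, sample_id"
    )
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield chunk


# Toy data standing in for real chip intensities (made-up values).
# An in-memory database keeps the demo self-contained; a real data set
# would live in a file on disk instead.
conn = sqlite3.connect(":memory:")
store_intensities(conn, [(f"snp{i}", "s1", float(i)) for i in range(10)])
chunk_sizes = [len(chunk) for chunk in iter_chunks(conn, 4)]
```

The point of the design is that downstream code only ever sees one chunk of
rows at a time, so peak memory depends on chunk_size rather than on the size
of the whole data set.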

Date: Thu, 7 Jun 2007 11:43:51 +0200
From: Olivier Nuñez <nunez at est-econ.uc3m.es>
Subject: [BioC] Memory issue in 500k chip analysis
To: Benilton Carvalho <bcarvalh at jhsph.edu>
Cc: bioconductor at stat.math.ethz.ch
Message-ID: <CFDBA356-5927-4F2B-93AE-47DD2F195C8A at est-econ.uc3m.es>

Dear Benilton,

an obvious and possibly naive way to avoid the common memory issues
in the analysis of 500K chips with oligo,
would be to select a sample of SNPs (likely under a stratified random
design) from the array before using snprma.
Therefore, my first question is whether or not such a sampling method
is feasible under oligo.
The "affxparser" package allows selecting a subset of SNPs, but I do not know
how the oligo package handles the values returned by a command like readCelUnits.
My second question is about the reliability of such a method:
The linear fit you use to perform the normalization is based on the
whole set of SNPs and the information from the mapping package.
Do you think the oligo preprocessing method remains valid (at least
for the selected sample) if it is based only on a sample of SNPs?
Thanks for your help.
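
The stratified random selection described above could be sketched as follows.
This is a minimal illustration in Python, not oligo or affxparser code: the
function name stratified_sample, the strata labels "A" and "B", and the SNP
ids are all hypothetical, and the choice of stratification variable is left
open. It only shows how a subset of SNP ids might be drawn, and says nothing
about whether snprma remains valid on that subset.

```python
import random
from collections import defaultdict


def stratified_sample(snp_to_stratum, fraction, seed=0):
    """Draw the same fraction of SNP ids from each stratum, without replacement."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_stratum = defaultdict(list)
    for snp_id, stratum in snp_to_stratum.items():
        by_stratum[stratum].append(snp_id)

    selected = []
    for stratum in sorted(by_stratum):
        ids = sorted(by_stratum[stratum])
        # Keep at least one SNP per stratum, even for very small strata.
        k = max(1, round(len(ids) * fraction))
        selected.extend(rng.sample(ids, k))
    return selected


# Toy annotation: 100 made-up SNP ids, 60 in stratum "A" and 40 in stratum "B".
annotation = {f"snp{i}": ("A" if i < 60 else "B") for i in range(100)}
subset = stratified_sample(annotation, fraction=0.1)
```

The resulting list of ids would then be passed to whatever routine reads only
the selected units from the CEL files.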

Best. Olivier


Olivier G. Nuñez
nunez at est-econ.uc3m.es
Universidad Carlos III de Madrid [ Tel : +34 663 03 69 09 ]
Departamento de Estadistica [ Fax : +34 91 624 98 49 ]
Facultad de Ciencias Sociales
C/ Madrid 126 GETAFE 28903 SPAIN
Web Page: http://www.est-econ.uc3m.es/onunez

