[BioC] Maximum number of CEL files for ReadAffy() in Affy package.

Henrik Bengtsson hb at stat.berkeley.edu
Wed Jul 23 02:02:21 CEST 2008


On Tue, Jul 22, 2008 at 4:04 PM, Hailong Cui <hcui1 at asu.edu> wrote:
> Dear all,
>
> First, I apologize for the mass email. I've been reading manuals, googling,
> searching the archive of the mailing list, but still cannot find an exact
> answer to my problem.
>
> (1) Question: Can a large number of CEL files cause an overflow in the
> function ReadAffy() in the affy package? Is there any way to fix this?
> The other options seem to be separate programs such as RMAExpress and
> dChip on Windows XP. Any suggestions?

The aroma.affymetrix package
[http://www.braju.com/R/aroma.affymetrix/] can handle very large data
sets.  It works for most Affymetrix chip types.  The memory overhead
is constant, so there is basically no limit on the number of arrays you
can process; e.g., I know people have successfully processed 4,500+
HG-U133A CEL files using it.
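
For orientation, here is a minimal sketch of what an RMA-style run could
look like with aroma.affymetrix.  It is not a tested recipe for this data
set.  The wrapper name rma_with_aroma() and its arguments are made up for
illustration, the class and function names are those of recent versions of
the package (older versions may differ), and the directory layout described
in the comments is assumed to be in place.

```r
# Sketch only. Assumes aroma.affymetrix is installed and that the working
# directory contains (dataSet and chipType are placeholders):
#   annotationData/chipTypes/<chipType>/<chipType>.cdf
#   rawData/<dataSet>/<chipType>/*.CEL
rma_with_aroma <- function(dataSet, chipType) {
  library(aroma.affymetrix)

  # Locate the CDF and set up the CEL set; intensities stay on disk
  cdf <- AffymetrixCdfFile$byChipType(chipType)
  cs <- AffymetrixCelSet$byName(dataSet, cdf = cdf)

  # RMA-style background correction, results written array by array
  bc <- RmaBackgroundCorrection(cs)
  csBC <- process(bc)

  # Quantile normalization of the PM probes
  qn <- QuantileNormalization(csBC, typesToUpdate = "pm")
  csN <- process(qn)

  # Fit the RMA probe-level model in chunks of probesets
  plm <- RmaPlm(csN)
  fit(plm)

  # Pull the probeset summaries (chip effects) into a data frame
  ces <- getChipEffectSet(plm)
  extractDataFrame(ces, units = NULL, addNames = TRUE)
}
```

A call would then look like rma_with_aroma("MyDataSet", "HG-U133A"), where
both names are placeholders.  As far as I know, the extracted chip effects
are on the intensity scale, so you would take log2 of the array columns to
compare them with justRMA() output.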

/Henrik

>
> (2) Background: What I am trying to do is to read in all the CEL files in
> the directory to create an AffyBatch object, so that I can use functions in
> the affy package. To be more specific, I want to do RMA, dChip normalization
> and get MA plots. On my workstation (48 64-bit CPUs, 500 GB memory),
> ReadAffy() worked fine for 241 CEL files, but when I moved on to 2,035 CEL
> files, it failed and kept showing the error message below. The number of
> rows for the CEL file is roughly 50k. On the bright side, I tried justRMA()
> and got the expression values in the text format.
>
>> R
>> library(affy)
>> Data <- ReadAffy()
> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData,  :
>  allocMatrix: too many elements specified
>
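
A note on the error itself: ReadAffy() stores all cell intensities in one
matrix (cells x arrays), and in R 2.7.1 a single matrix cannot hold more
than 2^31 - 1 elements, regardless of how much RAM is installed.  The
sketch below just does that arithmetic.  The chip type is not stated in
this thread, so the two chip sizes are assumptions for illustration, and
fits_in_matrix() is a made-up helper, not part of affy.

```r
# Largest number of elements a single vector/matrix can hold in R 2.7.1
max_elements <- .Machine$integer.max   # 2^31 - 1

# Hypothetical helper: would an (n_cells x n_arrays) matrix be allocatable?
fits_in_matrix <- function(n_cells, n_arrays) {
  # Multiply as doubles so the check itself cannot overflow
  as.numeric(n_cells) * as.numeric(n_arrays) <= max_elements
}

# Assumed chip sizes (cells per array = rows * columns on the chip)
cells_hgu133a     <- 712 * 712     # HG-U133A
cells_hgu133plus2 <- 1164 * 1164   # HG-U133 Plus 2.0

fits_in_matrix(cells_hgu133a, 2035)       # smaller chip, 2,035 arrays
fits_in_matrix(cells_hgu133plus2, 241)    # larger chip, 241 arrays
fits_in_matrix(cells_hgu133plus2, 2035)   # larger chip, 2,035 arrays

# Largest number of arrays of the larger chip that still fits
max_arrays_plus2 <- floor(max_elements / cells_hgu133plus2)
```

If the arrays are of the larger type, 241 files fit but 2,035 do not, which
would match what you see.  That justRMA() works is also consistent with
this, since it only keeps the PM probes rather than every cell.
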
>
> FYI, below is the session information on my workstation.
>
>> sessionInfo()
> R version 2.7.1 (2008-06-23)
> ia64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>  [1] geneplotter_1.18.0          annotate_1.18.0
>  [3] xtable_1.5-2                AnnotationDbi_1.2.2
>  [5] RSQLite_0.6-9               DBI_0.2-4
>  [7] lattice_0.17-8              BufferedMatrixMethods_1.4.0
>  [9] BufferedMatrix_1.4.0        affy_1.18.2
> [11] preprocessCore_1.2.0        affyio_1.8.0
> [13] Biobase_2.0.1
>
> loaded via a namespace (and not attached):
> [1] grid_2.7.1         KernSmooth_2.22-22 RColorBrewer_1.0-2
>
>
>
>
> Thank you so much for reading this and I would appreciate your reply.
>
> Hailong
>
>
> --
> Sincerely,
>
> Hailong Cui
>
> Computational Biosciences PSM Program
> Graduate Certificate in Statistics Program
> Web Page: http://mathpost.asu.edu/~hcui
>
> Graduate Teaching Associate (Instructor)
> Department of Mathematics & Statistics
> Arizona State University
> Tempe, AZ 85287-1804
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


