[BioC] ReadAffy gives Error

Ben Bolstad bmb at bmbolstad.com
Thu Apr 5 14:47:56 CEST 2007


The best solution would be to get additional memory (even additional
swap space would help). The part of the function that hogs memory is
where it instantiates the S4 AffyBatch object; at that point it has
already read all the CEL file intensity data into a matrix.
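
As a rough check on the numbers: the allocation in the error message
quoted further down (931491 Kb) is exactly the size of one matrix of
doubles holding every cell of 88 arrays, assuming the HG-U133 Plus 2.0
layout of 1164 x 1164 cells and 8 bytes per value. The sketch below is
only that arithmetic, with illustrative variable names. Actual peak usage
will be some multiple of this figure, since building the AffyBatch needs
room beyond the matrix that has already been read.

```r
# Back-of-the-envelope size of the raw intensity matrix.
# Assumption: HG-U133 Plus 2.0 arrays have 1164 x 1164 cells.
cells_per_array <- 1164 * 1164   # 1354896 cells per CEL file
n_arrays        <- 88
bytes_per_value <- 8             # intensities held as doubles

matrix_kb <- cells_per_array * n_arrays * bytes_per_value / 1024
matrix_kb                        # 931491, the size in the error message
matrix_kb / 1024^2               # roughly 0.89 GB for a single copy
```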

As an alternative (and I hesitate to promote this on the BioC mailing
list) you might try looking at RMAExpress:

http://rmaexpress.bmbolstad.com

The most recent (beta) version now includes the PLM-based methodology
(aka BioC's affyPLM/fitPLM), including NUSE, RLE and chip
pseudo-images. I've personally tested it on a dataset with 350 HGU133A
plus 2 arrays on a Linux machine with 3GB RAM and 6GB swap with no
problems (and in a benchmarking experiment many eons ago pushed it to
3000 of the old HGU133A arrays). Note that on a Linux machine you'd have
to build it yourself (binaries are only supplied for Windows).

If you are just interested in the more traditional QC metrics à la those
in simpleaffy (e.g., percent present), then I believe those are all
computed in a single-chip manner (and someone who uses these more often
can correct me if I am wrong), so you could read your data in
smaller batches in this situation.
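
A minimal sketch of the batch approach, assuming a per-chip metric such
as percent present is all that is needed and that it really is computed
one chip at a time. The helper names (split_into_batches,
percent_present_by_batch) and the batch size of 10 are made up for
illustration; ReadAffy(), list.celfiles() and mas5calls() come from the
affy package and exprs() from Biobase. Only one small AffyBatch is held
in memory at a time.

```r
# Split a character vector of file names into chunks of at most batch_size.
split_into_batches <- function(files, batch_size = 10) {
  split(files, ceiling(seq_along(files) / batch_size))
}

# Percent present per chip, computed one small AffyBatch at a time.
# Needs the affy and Biobase packages; nothing is loaded until this is called.
percent_present_by_batch <- function(files, batch_size = 10) {
  result <- lapply(split_into_batches(files, batch_size), function(batch) {
    ab    <- affy::ReadAffy(filenames = batch)    # small AffyBatch
    calls <- Biobase::exprs(affy::mas5calls(ab))  # "P", "M" or "A" per probeset
    pp    <- 100 * colMeans(calls == "P")         # one value per chip
    rm(ab, calls)
    invisible(gc())                               # free memory before the next batch
    pp
  })
  unlist(unname(result))
}

# Usage (not run here):
# files <- affy::list.celfiles(full.names = TRUE)
# percent_present_by_batch(files)
```

This does not help with anything that has to see all chips at once
(normalization, PLM-based QC), which is where the other suggestions
in this thread come in.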

Best,

Ben

On Wed, 2007-04-04 at 16:15 -0400, Boel Brynedal wrote:
> Dear all,
> 
> How much RAM is needed to read and analyze 88 hgu133plus2 arrays?
> As I understand it, the ReadAffy() step itself would not be the
> problem, but rather the normalization. In this case I want to do all
> of the quality controls, so I want the AffyBatches.
> I had the impression that 4GB would be enough.
> 
> Best,
> Boel
> 
> On Wed, 2007-04-04 at 09:50 -0400, James W. MacDonald wrote:
> > Boel Brynedal wrote:
> > >>>Error: cannot allocate vector of size 931491 Kb
> > >>
> > >>This error indicates that you need more RAM.
> > > 
> > > 
> > > But I have 4GB of RAM, shouldn't that be enough? 
> > 
> > Depends on what kind of chip you are using. It might work for older 
> > chips (e.g., hgu95av2), but probably not for the current generation of 
> > 3' arrays (e.g., hgu133plus2).
> > 
> > > Is there a limitation for how much memory R can use? And, if there is,
> > > how can I change this?
> > 
> > There are limits on the size of objects, but you will not be hitting 
> > that here. On Linux R will take all the memory it requires without any 
> > intervention by you, so if you are getting this error you have hit the 
> > wall. Are you doing other memory-hungry things concurrently?
> > 
> > There are ways around this that don't require purchasing RAM. First, you 
> > can use justRMA() which will undoubtedly be able to process all your 
> > chips. The downside is no AffyBatch, so you can't do QA plots of the raw 
> > data.
> > 
> > Another alternative is to use read.probematrix(), which will read in 
> > just the PM and/or MM probes. You can use these data for quality 
> > assessment, etc, but you will be missing all the niceties that come with 
> > using an AffyBatch.
> > 
> > > 
> > > 
> > >>>Error in isVersioned(object) : error in evaluating the argument 'object'
> > >>>in selecting a method for function 'isVersioned'
> > >>
> > >>Not sure about this one. It may just be an artifact of the first error, 
> > >>or indicate a mismatch in your package versions. How did you install the 
> > >>BioC packages? What is your sessionInfo()?
> > > 
> > > 
> > > Bioconductor was installed using biocLite(); other packages were also
> > > downloaded and installed (using e.g. R CMD INSTALL simpleaffy).
> > 
> > You should use biocLite() for all package installation. If you just grab 
> > things and install directly you always run the risk that you are 
> > installing something that is an incorrect version for the version of 
> > R/BioC that you have. Using biocLite() ensures that you get the correct 
> > thing.
> > 
> > For instance, simpleaffy 2.4.2 is not the correct version for use with 
> > BioC 1.9. You should have 2.8.0. This doesn't explain the isVersioned 
> > error, as your affy/Biobase/affyio are all correct versions. It is 
> > probably just because you ran out of memory.
> > 
> > Best,
> > 
> > Jim
> > 
> > 
> > 
> > 
> > > 
> > > This is my sessionInfo()
> > > 
> > >>sessionInfo()
> > > 
> > > R version 2.4.1 (2006-12-18)
> > > x86_64-unknown-linux-gnu
> > > 
> > > locale:
> > > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> > > 
> > > attached base packages:
> > >  [1] "grid"      "splines"   "tools"     "stats"     "graphics"  "grDevices"
> > >  [7] "utils"     "datasets"  "methods"   "base"
> > > 
> > > other attached packages:
> > >  simpleaffy  genefilter    survival     IDPmisc     lattice     affyPLM
> > >     "2.4.2"    "1.12.0"      "2.30"     "0.9.1"   "0.14-16"    "1.10.0"
> > >       gcrma matchprobes    affydata        affy      affyio     Biobase
> > >     "2.6.0"     "1.6.0"    "1.10.0"    "1.12.2"     "1.2.0"    "1.12.2"
> > > 
> > > I can read 4 CEL files without any problems, so maybe this is a memory
> > > problem altogether, but I really thought 4 GB of RAM would be enough.
> > > 
> > > Thankful for any advice,
> > > Boel
> > > 
> > >>Best,
> > >>
> > >>Jim
> > > 
> > > 
> > >>>Any suggestions as to what is wrong? 
> > >>>As you might imagine, I am quite new in this field. 
> > >>>
> > >>>Best regards,
> > >>>Boel Brynedal, PhD student, Karolinska Institutet, Sweden.
> > >>>
> > >>>_______________________________________________
> > >>>Bioconductor mailing list
> > >>>Bioconductor at stat.math.ethz.ch
> > >>>https://stat.ethz.ch/mailman/listinfo/bioconductor
> > >>>Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> > >>
> > >>
> > > 
> > 
> >
> 