[BioC] Memory limit (vector size) on linux 64bit

Ivan Porro pivan at dist.unige.it
Fri Feb 9 15:47:03 CET 2007


Quoting "James W. MacDonald" <jmacdon at med.umich.edu>:

> Hi Ivan,
>
> Ivan Porro wrote:
> > Hi all,
> >
> > I'm running a script that tries to normalise 448 HGU133A Affymetrix
> > arrays, and I get the following error during ReadAffy():
> >
> >   Error: cannot allocate vector of size 1770343 Kb
> >
> > I know about R and OS addressing limitations, so (according to several
> > posts on the mailing list) I'm doing this on a 64-bit server:
> >
> > x86_64 GNU/Linux (2x AMD Opteron 275)
> > R 2.3.1 compiled from source
> > MemTotal:         7819808 kB
> > VmallocTotal: 34359738367 kB
> >
> > that is, 8GB of RAM (the difference is probably reserved by an on-board
> > video card) and up to 34TB of swap.
> >
> > I know from the R FAQ that "There are also limits on individual objects.
> > On all versions of R, the maximum length (number of elements) of a
> > vector is 2^31 - 1 ~ 2*10^9."
> >
> > But 2,147,483,648 = 2^31 is bigger than 1,770,343,000 bytes (my vector
> > size).
>
> I think you are comparing apples and oranges here. The number of
> elements in a vector is different from the amount of RAM that the
> vector occupies.

Thank you. I had missed this point, probably due to my poor knowledge of R
internals.
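
To make this point concrete, here is a quick back-of-the-envelope check in
R (a sketch; the 712 x 712 HG-U133A chip dimensions are my assumption):

    ## The failed allocation was 1770343 Kb. A numeric (double) vector
    ## uses 8 bytes per element, so that request corresponds to:
    bytes    <- 1770343 * 1024    # ~1.81e9 bytes
    elements <- bytes / 8         # ~2.27e8 doubles
    elements < 2^31 - 1           # TRUE: far below the length limit

    ## Assuming 712 x 712 = 506944 cells per HG-U133A array, the raw
    ## intensity matrix for 448 arrays alone needs roughly:
    506944 * 448 * 8 / 2^30       # ~1.7 GB, before any working copies

So the 2^31 - 1 element limit is not the problem; R simply could not find
another 1.7 GB of free memory.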

> >
> > Am I near (or above) R's physical limitations?
> >
> > I use batch <- ReadAffy(), and then I want to normalize it with gcrma()
> > using an invariantset normalization.
>
> With that number of arrays I don't think you have enough RAM. The error
> you see doesn't mean that there is only one thing of 1.7 Gb being
> allocated. What it means is that R is trying to allocate 1.7 Gb for a
> vector, and you have less than that available (the rest has already been
> allocated).
>
> If you want to do gcrma(), you will probably be able to use justGCRMA(),
> which skips the AffyBatch altogether. If you want to do gcrma() with an
> invariantset normalization you might be able to hack just.gcrma() enough
> to use a modified normalize.AffyBatch.invariantset. Personally, I don't
> think it would be worth the effort, but if you really want an
> invariantset normalization, that would probably be your best bet.

I'll try your approach, and if it is too complex for me I'll normalise the
data using dChip, which we successfully ported to Linux/MPI and which has
already worked fine with this amount of data. However, I'm worried about
differences in results between R/Bioconductor and dChip (but that is another
topic).
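
In case it helps someone searching the archives, here is a minimal sketch
of the justGCRMA() route you suggest (the CEL-file directory is a
placeholder, and I have not yet tried this at scale):

    library(gcrma)

    ## Compute GCRMA expression values directly from the CEL files,
    ## skipping the intermediate AffyBatch to keep peak memory lower.
    ## "/data/celfiles" is a hypothetical location.
    eset <- justGCRMA(celfile.path = "/data/celfiles")

    ## Write the resulting expression matrix to disk.
    write.exprs(eset, file = "gcrma_expression.txt")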

> HTH,
> Jim

Yes, definitely. Thank you. I'll either ask for a 16GB machine or ask the
users to evaluate the R/dChip differences; I hope they'll choose the
cheapest option.

thank you,

ivan

> > thank you in advance,
> >
> >     Ivan
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>
>
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not be
> used for urgent or sensitive issues.
>


-- 
http://www.bio.dist.unige.it
voice: +39 010 3532789
fax:   +39 010 3532948


