[R] Memory Problems in R
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Aug 18 20:45:54 CEST 2004
On Wed, 18 Aug 2004, Roger D. Peng wrote:
> There is a limit on how long a single vector can be, and I think it's
> 2GB (even on 64-bit platforms). Not sure how the gc trigger is set....
There is a limit of R_SIZE_T_MAX bytes, but that is defined as ULONG_MAX
which should be 4GB-1 on a 32-bit platform, and much more on a 64-bit
platform.
The example works on a 64-bit platform, which demonstrates that there is
no 2GB limit there.
If you hit the length limit, the message is of the form
cannot allocate vector of length ...
Looking at the code in memory.c it seems that
    if (size >= (LONG_MAX / sizeof(VECREC)) - sizeof(SEXPREC_ALIGN) ||
        (s = malloc(sizeof(SEXPREC_ALIGN) + size * sizeof(VECREC)))
        == NULL) {
        /* reset the vector heap limit */
        R_VSize = old_R_VSize;
        errorcall(R_NilValue, "cannot allocate vector of size %lu Kb",
                  (size * sizeof(VECREC))/1024);
    }
has a limit of LONG_MAX bytes for a vector. I think that is
unintentional, and you might like to try ULONG_MAX there and re-compile.
But it really doesn't make much difference as there is very little you can
do with an object taking up more than half the maximum memory size
except access bits of it (and that is what DBMSes are for).
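To make the arithmetic concrete, here is a small R sketch (plain arithmetic, not R internals; the 24-byte header is an assumption standing in for sizeof(SEXPREC_ALIGN) on this 32-bit build) that reproduces the boundary Scott hit below:

```r
## Sizes in the memory.c check are counted in 8-byte VECRECs (one per
## double).  On a 32-bit platform LONG_MAX is 2^31 - 1:
LONG_MAX <- 2^31 - 1
## Assumed header overhead, standing in for sizeof(SEXPREC_ALIGN):
header <- 24
threshold <- LONG_MAX %/% 8 - header
threshold                        # 268435431
## rep(0, 268435431) trips 'size >= threshold' and fails;
## rep(0, 268435430) is one element below it and succeeds.
```

This matches Scott's output exactly: the failing length is 268435431 and the succeeding one is 268435430.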
A few comments:
1) Of course R does have objects in memory, 12.5Mb of them according to
gc. You are not starting with a clean slate. Hopefully malloc has
allocated them in a compact group.
2) Solaris has been a 64-bit OS for at least 7 years and you really should
be using a 64-bit build of R if you plan on exceeding 1Gb.
3) To create a matrix efficiently, create a vector and assign a dim. I
gave an example on R-help yesterday, so please check the archives.
matrix() makes a copy of the data and so needs double the space you
think it does. Take a look at the source code:
    PROTECT(snr = allocMatrix(TYPEOF(vals), nr, nc));
    if (lendat) {
        if (isVector(vals))
            copyMatrix(snr, vals, byrow);
        else
            copyListMatrix(snr, vals, byrow);
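The no-copy alternative can be sketched as follows (using Scott's dimensions; the size in the comment is from his own output):

```r
## matrix() copies its data argument into the freshly allocated matrix,
## so peak usage is roughly twice the object's size.  Allocating the
## vector and then assigning dim() turns it into a matrix in place:
v <- numeric(2510 * 25000)   # one allocation, ~479 Mb of doubles
dim(v) <- c(2510, 25000)     # same memory, now a 2510 x 25000 matrix
```

The second line changes only the dim attribute, so no second copy of the 479 Mb of data is ever made.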
4) The source code is the documentation here. I suspect no one person
knows all the details.
> Scott Gilpin wrote:
> > Hello everyone -
> >
> > I have a couple of questions about memory management of large objects.
> > Thanks in advance for your response.
> >
> > I'm running R version 1.9.1 on Solaris 8, compiled as a 32-bit app.
> > My system has 12.0 GB of memory, with usually ~ 11GB free. I checked
> > system limits using ulimit, and there is nothing set that would limit
> > the maximum amount of memory for a process (with the exception of an
> > 8MB stack size). I've also checked the amount of memory available to
> > R using mem.limits(), and there is no limit set.
> >
> > I'm running into two problems. The first is the error "cannot
> > allocate vector of size XXXXX" - I know this has been discussed
> > several times on this mailing list, but it usually seems the user does
> > not have enough memory on their system, or does not have the memory
> > limits set correctly. I don't believe this is the case in this
> > situation. I verified that I don't have any objects in memory when R
> > starts up, and that memory limits are set to NA. Here is some output:
> >
> >
> >>ls()
> >
> > character(0)
> >
> >>mem.limits()
> >
> > nsize vsize
> >    NA    NA
> >
> >>gc()
> >
> >           used (Mb) gc trigger (Mb)
> > Ncells  432197 11.6     531268 14.2
> > Vcells  116586  0.9     786432  6.0
> >
> >>v<-rep(0,268435431)
> >
> > Error: cannot allocate vector of size 2097151 Kb
> >
> >>v<-rep(0,268435430)
> >>object.size(v)
> >
> > [1] 2147483468
> >
> >>gc()
> >
> >              used   (Mb) gc trigger   (Mb)
> > Ncells     432214   11.6     741108   19.8
> > Vcells  268552029 2048.9  268939773 2051.9
> >
> >
> > Does R have a limit set on the size of an object that it will
> > allocate? I know that the entire application will only be able to use
> > 4GB of memory (because it's only 32bit), but I haven't found anything
> > in the R documentation or the help lists that indicates there is a
> > maximum on the size of an object. I understand there will be problems
> > if an object is greater than 2GB and needs to be copied - but will R
> > limit the creation of such an object? It's also my understanding that
> > the garbage collector won't move objects and this may cause memory to
> > become fragmented - but I'm seeing these issues on startup when there
> > are no objects in memory.
> >
> >
> > My second problem is with matrices and the garbage collector, and the
> > limits it sets for gc trigger after a matrix is created. When I
> > create a vector of approximately 500MB, R sets the gc trigger to be
> > slightly above this amount. The gc trigger also seems to correspond
> > to the process size (as output by top). When I create a matrix of
> > approximately 500MB, R sets the gc trigger to be roughly 3 times the
> > size of the matrix (and the process size is ~ 1.5GB). Therefore, when
> > I try to create larger matrices, where 3x the size of the matrix is
> > greater than 4GB, R gives me an error. Is there anything I can do to
> > create large matrices? Or do I have to manipulate large objects as a
> > vector?
> >
> > Output from the 3 different scenarios is below:
> >
> > 1) - can't create a matrix, but can create a vector
> >
> > [Previously saved workspace restored]
> >
> >
> >>m<-matrix(rep(0,25000*10000),nrow=10000)
> >
> > Error: cannot allocate vector of size 1953125 Kb
> >
> >>v<-rep(0,25000*10000)
> >>object.size(v)/1024
> >
> > [1] 1953125
> >
> >
> > 2) gc trigger is set slightly higher than the size of the vector
> >
> >
> >>ls()
> >
> > character(0)
> >
> >>mem.limits()
> >
> > nsize vsize
> >    NA    NA
> >
> >>gc()
> >
> >           used (Mb) gc trigger (Mb)
> > Ncells  432197 11.6     531268 14.2
> > Vcells  116586  0.9     786432  6.0
> >
> >>v<-rep(0,(2510)*(25000))
> >>object.size(v)
> >
> > [1] 5.02e+08
> >
> >>gc()
> >
> >             used  (Mb) gc trigger  (Mb)
> > Ncells    432210  11.6     667722  17.9
> > Vcells  62866589 479.7   63247172 482.6
> >
> >
> > 3) gc trigger is set ~ 3x the size of the matrix
> >
> >
> >>ls()
> >
> > character(0)
> >
> >>mem.limits()
> >
> > nsize vsize
> >    NA    NA
> >
> >>gc()
> >
> >           used (Mb) gc trigger (Mb)
> > Ncells  432197 11.6     531268 14.2
> > Vcells  116586  0.9     786432  6.0
> >
> >>A<-matrix(rep(0,(2510)*(25000)),nrow=(2510),ncol=(25000))
> >>object.size(A)
> >
> > [1] 502000120
> >
> >>gc()
> >
> >             used  (Mb) gc trigger   (Mb)
> > Ncells    432213  11.6     741108   19.8
> > Vcells  62866590 479.7  188640940 1439.3
> >
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >
>
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595