[R] Error: cannot allocate vector of size xxx Mb

Petr PIKAL petr.pikal at precheza.cz
Thu Aug 5 12:17:16 CEST 2010


Hi

I am not an expert in such issues (I have never really run into 
problems with memory size).
 
From what I have read in previous posts on this topic (and there are 
many), the simplest way would be to move to a 64-bit system (Linux, 
64-bit Windows Vista or 7), where the size of objects is limited only 
by the amount of available memory.
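
You can check which build you are running and how much R is allowed 
to use with something like the following (memory.limit() is 
Windows-only):

.Machine$sizeof.pointer   # 8 on a 64-bit build, 4 on a 32-bit one
R.version$arch            # e.g. "i386" (32-bit) or "x86_64" (64-bit)
memory.limit()            # current limit in MB (Windows only)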

There are some packages for dealing with big data (biglm, ...) or for 
a database approach (sqldf).
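
A rough, untested sketch of both ideas (the column and variable names 
here are made up, so adapt them to your data):

library(biglm)
# fit a regression in chunks instead of holding all rows at once
fit <- biglm(y ~ x1 + x2, data = chunk1)
fit <- update(fit, chunk2)   # feed further chunks as they are read in

library(sqldf)
# or let SQLite do the subsetting and pull back only the rows you need
small <- sqldf("select one, two from first where one > 0")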

Your R version is a bit obsolete, so upgrading could help, but 
probably not with your final operation.

Sometimes it helps to rethink why you need such a huge amount of data 
in memory at once, and whether you could use only a sample of the 
data for further study.
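
For example, something like this (only a sketch, using the object 
names from your code below; adjust the sample size to your needs):

idx   <- sample(nrow(first), 100000)   # keep 100 000 random rows
small <- first[idx, ]
rm(a, b, c, d, first, second)          # drop the big objects
gc()                                   # give the memory back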

Regards
Petr

Ralf B <ralf.bierig at gmail.com> wrote on 05.08.2010 11:13:40:

> Thank you for such a careful and thorough analysis of the problem and
> your comparison with your configuration. I very much appreciate it.
> For completeness and (perhaps) further comparison, I have executed
> 'version' and sessionInfo() as well:
> 
> 
> > version
>                _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status         RC
> major          2
> minor          10.0
> year           2009
> month          10
> day            25
> svn rev        50206
> language       R
> version.string R version 2.10.0 RC (2009-10-25 r50206)
> > sessionInfo()
> R version 2.10.0 RC (2009-10-25 r50206)
> i386-pc-mingw32
> 
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
>  [1] splines   stats4    grid      stats     graphics  grDevices utils
>  [8] datasets  methods   base
> 
> other attached packages:
>  [1] flexmix_2.2-7     multcomp_1.1-7    survival_2.35-8   mvtnorm_0.9-9
>  [5] modeltools_0.2-16 lattice_0.18-3    car_1.2-16        psych_1.0-88
>  [9] nortest_1.0       gplots_2.8.0      caTools_1.10 bitops_1.0-4.1
> [13] gdata_2.8.0       gtools_2.6.2      ggplot2_0.8.7     digest_0.4.2
> [17] reshape_0.8.3     plyr_0.1.9        proto_0.3-8       RJDBC_0.1-5
> [21] rJava_0.8-2       DBI_0.2-5
> 
> loaded via a namespace (and not attached):
> [1] tools_2.10.0
> 
> > memory.limit()
> [1] 2047
> 
> 
> 
> Also, the example I presented was a simplified reproduction of the
> real data structure; my real data structure does not reuse vectors.
> I merely wanted to show the error occurring when processing large
> vectors into data frames and then binding these data frames together.
> I hope this additional information helps. I might add that I am
> running this in StatET under Eclipse with 512 MB of RAM allocated to
> the environment.
> 
> Besides adding more memory, can you spot simple ways in which memory
> use could be improved? I know that I am carrying quite a bit of
> baggage. Unfortunately my script is rather comprehensive, and my
> example is really just a simplified part that I created to reproduce
> the problem.
> 
> Thanks,
> Ralf
> 
> 
> On Thu, Aug 5, 2010 at 4:44 AM, Petr PIKAL <petr.pikal at precheza.cz> wrote:
> > Hi
> >
> > r-help-bounces at r-project.org wrote on 05.08.2010 09:53:21:
> >
> >> I am dealing with very large data frames, artificially created with
> >> the following code, that are combined using rbind.
> >>
> >>
> >> a <- rnorm(5000000)
> >> b <- rnorm(5000000)
> >> c <- rnorm(5000000)
> >> d <- rnorm(5000000)
> >> first <- data.frame(one=a, two=b, three=c, four=d)
> >> second <- data.frame(one=d, two=c, three=b, four=a)
> >
> > Up to this point there is no error on my system
> >
> >> version
> >               _
> > platform       i386-pc-mingw32
> > arch           i386
> > os             mingw32
> > system         i386, mingw32
> > status         Under development (unstable)
> > major          2
> > minor          12.0
> > year           2010
> > month          05
> > day            31
> > svn rev        52164
> > language       R
> > version.string R version 2.12.0 Under development (unstable) (2010-05-31 r52164)
> >
> >> sessionInfo()
> > R version 2.12.0 Under development (unstable) (2010-05-31 r52164)
> > Platform: i386-pc-mingw32/i386 (32-bit)
> >
> > attached base packages:
> > [1] stats     grDevices datasets  utils     graphics  methods   base
> >
> > other attached packages:
> > [1] lattice_0.18-8 fun_1.0
> >
> > loaded via a namespace (and not attached):
> > [1] grid_2.12.0  tools_2.12.0
> >
> >> rbind(first, second)
> >
> > Although first and second are each only roughly 160 MB, their
> > concatenation probably consumes all the remaining memory space, as
> > you already have a-d, first and second in memory.
> >
> > Regards
> > Petr
> >
> >>
> >> which results in the following error for each of the statements:
> >>
> >> > a <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > b <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > c <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > d <- rnorm(5000000)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > first <- data.frame(one=a, two=b, three=c, four=d)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > second <- data.frame(one=d, two=c, three=b, four=a)
> >> Error: cannot allocate vector of size 38.1 Mb
> >> > rbind(first, second)
> >>
> >> When running memory.limit() I am getting this:
> >>
> >> memory.limit()
> >> [1] 2047
> >>
> >> That shows me that I have 2 GB of memory available. What is wrong?
> >> Shouldn't allocating 38 MB be easily feasible?
> >>
> >> Best,
> >> Ralf
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >


