[R] large data set, error: cannot allocate vector

Thomas Lumley tlumley at u.washington.edu
Mon May 8 16:47:20 CEST 2006


On Fri, 5 May 2006, Robert Citek wrote:

>
> On May 5, 2006, at 11:30 AM, Thomas Lumley wrote:
>> In addition to Uwe's message it is worth pointing out that gc()
>> reports
>> the maximum memory that your program has used (the rightmost two
>> columns).
>> You will probably see that this is large.
>
> Reloading the 10 MM dataset:
>
> R > foo <- read.delim("dataset.010MM.txt")
>
> R > object.size(foo)
> [1] 440000376
>
> R > gc()
>            used  (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells 10183941 272.0   15023450 401.2 10194267 272.3
> Vcells 20073146 153.2   53554505 408.6 50086180 382.2
>
> Combined, Ncells and Vcells appear to take up about 700 MB of RAM,
> which is about 25% of the 3 GB available under Linux on a 32-bit
> architecture.  Also, removing foo seemed to free up "used" memory,
> but didn't change the "max used" columns:

No, that's what "max" means.  You need gc(reset=TRUE) to reset the max.
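
A minimal sketch for the archives (the object and sizes here are illustrative, not Robert's actual dataset; exact numbers will differ by session):

```r
foo <- numeric(5e7)    # allocate ~400 MB of doubles, comparable in scale
gc()                   # "max used" columns record the session's peak

rm(foo)                # removing foo frees "used" memory...
gc(reset = TRUE)       # ...and reset=TRUE zeroes the "max used" counters
gc()                   # the peak now reflects only memory still in use
```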

 	-thomas


Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
