[Rd] Need for garbage collection after creating object

Henrik Bengtsson hb at stat.berkeley.edu
Tue Feb 5 19:12:52 CET 2008


On Feb 5, 2008 8:01 AM, Iago Mosqueira <iago.mosqueira at gmail.com> wrote:
> Hello,
>
> After experiencing some difficulties with large arrays, I was surprised
> to see the apparent need for calls to gc() after creating fairly large
> arrays. For example, calling
>
> a<-array(2, dim=c(10,10,10,10,10,100))
>
> makes the memory usage of a fresh session of R jump from 13.8 Mb to
> 166.4 Mb. A call to gc() brought it down to 90.8 Mb,
>
>  > gc()
>             used (Mb) gc trigger  (Mb) max used  (Mb)
> Ncells   132619  3.6     350000   9.4   350000   9.4
> Vcells 10086440 77.0   21335887 162.8 20086792 153.3
>
> as expected by
>
>  > object.size(a)
>
> [1] 80000136
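
As a quick sanity check on those numbers, here is the arithmetic spelled out. It assumes 8 bytes per double and that gc() reports Mb as 2^20 bytes, which matches the 10086440 Vcells / 77.0 Mb line above; the small remainder in object.size() is header and attribute overhead, which varies by platform and R version.

```r
n <- prod(c(10, 10, 10, 10, 10, 100))  # number of elements in the array
payload_bytes <- 8 * n                  # doubles take 8 bytes each
payload_mb <- payload_bytes / 2^20      # same unit gc() uses for its "(Mb)" columns
print(c(elements = n, bytes = payload_bytes, Mb = payload_mb))
```

So one copy of the data is a bit over 76 Mb, and the jump to roughly twice that is consistent with a second, temporary copy.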

I think the reason for this is that array() has to "expand" the input
data to the right length internally;

 data <- rep(data, length.out = vl)

The result is a so-called "NAMED" object internally, and when the following call to

  dim(data) <- dim

occurs, the safest thing R can do is to create a copy. [Anyone,
correct me if I'm wrong].
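
One way to check whether such a copy happens is tracemem(), sketched below. This only works if R was built with memory profiling (hence the capabilities() guard), the vector length and dims are made up for illustration, and whether a copy is reported depends on the R version and on how the object is referenced, so treat it as a diagnostic rather than a statement about what array() does.

```r
data <- rep(2, length.out = 1000)
if (capabilities("profmem")) {
  tracemem(data)            # start reporting duplications of this vector
  dim(data) <- c(10, 100)   # a "tracemem[...]" line appears only if R copies it
  untracemem(data)
} else {
  dim(data) <- c(10, 100)   # same assignment, without the tracing
}
```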

If you expand the input data yourself, you won't see that extra copy, e.g.

  data <- 2
  dim <- c(10,10,10,10,10,100)
  data <- rep(data, length.out=prod(dim))
  a <- array(data, dim=dim)
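
To compare the two approaches on your own setup, something like the following rough sketch might help. peak_extra_mb() is a hypothetical helper, not part of R; it relies on gc(reset = TRUE) resetting the "max used" statistics and reads the last column of the gc() matrix. As far as I know the "max used" figure is only updated when a collection runs, so take it as an indication, not an exact peak, and expect the result to differ between R versions.

```r
# Hypothetical helper: approximate extra vector memory (in Mb) seen while
# evaluating 'expr', relative to what was in use just before.
peak_extra_mb <- function(expr) {
  before <- gc(reset = TRUE)   # full collection; resets "max used"
  force(expr)                  # evaluate the lazily passed expression
  after <- gc()
  # last column is "max used" in Mb; column 2 is "used" in Mb
  after["Vcells", ncol(after)] - before["Vcells", 2]
}

dim <- c(10, 10, 10, 10, 10, 100)
print(peak_extra_mb(array(2, dim = dim)))
print(peak_extra_mb(array(rep(2, length.out = prod(dim)), dim = dim)))
```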

>
> Do I need to call gc() after creating every large array, or can I setup
> the system to do this more often or efficiently?

The R garbage collector will free/deallocate that memory when
"needed".  However, calling gc() explicitly should reduce the risk
of over-fragmented memory.  Basically, if there are several blocks of
garbage memory hanging around, you might end up in a situation where
you have a lot of *total* memory available, but can only allocate
small chunks of memory at any one time.  Even calling gc() in
that situation will not help; there is no mechanism that defragments
memory in R.  So calling gc() after large allocations will add some
protection against that.
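
In practice that could look like the sketch below: drop the large temporary as soon as you are done with it and collect right away. The object names and size are made up for illustration, and how much this helps with fragmentation will depend on the platform's allocator.

```r
tmp <- numeric(1e6)   # some large temporary object
result <- sum(tmp)    # ... work that needs it ...
rm(tmp)               # drop the only reference to it
invisible(gc())       # collect now rather than waiting for R to trigger a GC
```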

/Henrik

>
> Thanks very much,
>
>
> Iago
>
>
> $platform
> [1] "i686-pc-linux-gnu"
> $version.string
> [1] "R version 2.6.1 (2007-11-26)"
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


