[R] Memory usage reported by gc() differs from 'top'

Milan Bouchet-Valat nalimilan at club-internet.fr
Thu Apr 18 12:18:03 CEST 2013


On Wednesday, 17 April 2013 at 23:17 -0400, Christian Brechbühler wrote:
> In help(gc) I read, "...the primary purpose of calling 'gc' is for the
> report on memory usage".
> What memory usage does gc() report?  And more importantly, which memory
> uses does it NOT report?  Because I see one answer from gc():
> 
>            used  (Mb) gc trigger   (Mb) max used  (Mb)
> Ncells 14875922 794.5   21754962 1161.9 17854776 953.6
> Vcells 59905567 457.1   84428913  644.2 72715009 554.8
> 
> (That's about 1.5g max used, 1.8g trigger.)
> And a different answer from an OS utility, 'top':
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 
>  6210 brech     20   0 18.2g 7.2g 2612 S    1 93.4  16:26.73 R
> 
> So the R process is holding on to 18.2g memory, but it only seems to have
> account of 1.5g or so.
> Where is the rest?
> 
> I tried searching the archives, and found answers like "just buy more RAM".
>  Which doesn't exactly answer my question.  And come on, 18g is pretty big;
> sure it doesn't fit in my RAM (only 7.2g are in), but that's beside the
> point.
> 
> The huge memory demand is specific to R version 2.15.3 Patched (2013-03-13
> r62500) -- "Security Blanket".  The same test runs without issues under R
> version 2.15.1 beta (2012-06-11 r59557) -- "Roasted Marshmallows".
> 
> I appreciate any insights you can share into R's memory management, and
> gc() in particular.
> /Christian

First, stop looking at virtual memory altogether: it means very little, if
anything. What you care about is resident memory (the RES column in top,
7.2g here). See e.g.:
http://serverfault.com/questions/138427/top-what-does-virtual-memory-size-mean-linux-ubuntu

Then, there is a limitation with R/Linux: gc() does not move objects in memory
so that they all end up in the same area. This means that while the total size
of R objects reported by gc() is about 1.25GB (794.5MB of Ncells plus 457.1MB
of Vcells), they are spread all over the address space, and a single live
object in a memory page is enough to keep that whole page allocated to the
process.
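
To see the gap between the two numbers from inside R, something like the
following sketch may help. It sums the "used (Mb)" column of gc() and, on
Linux only, reads the resident size from /proc/self/status. The variable names
are mine, and reading the VmRSS line this way is an assumption about the
platform, not something gc() provides.

```r
# Memory R itself accounts for: column 2 of the gc() matrix is "used" in Mb,
# one row for Ncells and one for Vcells.
mem <- gc()
r_heap_mb <- sum(mem[, 2])

# Resident memory as the kernel sees it (Linux only; stays NA elsewhere).
status_file <- "/proc/self/status"
rss_mb <- NA_real_
if (file.exists(status_file)) {
  status <- readLines(status_file)
  rss_line <- grep("^VmRSS:", status, value = TRUE)
  if (length(rss_line) == 1) {
    # The line looks like "VmRSS:    123456 kB"
    rss_mb <- as.numeric(gsub("[^0-9]", "", rss_line)) / 1024
  }
}

cat(sprintf("gc() used: %.1f Mb, resident: %.1f Mb\n", r_heap_mb, rss_mb))
```

A resident figure far above the gc() figure is consistent with fragmentation,
though memory allocated by compiled code outside R's heap would show up the
same way.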

I do not know the exact details, but it seems that Windows does a better
job than Linux in that regard. One workaround is to save the session and
restart R: objects will be loaded in a more compact fashion.
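
A minimal sketch of that workaround, assuming the objects live in the global
environment; the file name "session.RData" is only an illustration:

```r
# Write every object of the global environment to a file.
# "session.RData" is a hypothetical file name; pick any path you like.
save.image("session.RData")

# ... quit R here and start a fresh session, then:
load("session.RData")   # objects are allocated afresh, one after another
gc()                    # report memory usage in the new session
```

Run without the restart in between, this only rewrites and reloads the same
objects; the restart is what gives R a fresh heap to load them into.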

As for the differences between R 2.15.1 and R 2.15.3, maybe there is some
more copying that increases memory fragmentation, but the fundamental
problem has not changed AFAIK. You can call tracemem() on large objects
to see how many times they are being copied. See
http://developer.r-project.org/memory-profiling.html
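
For instance, here is a small sketch of tracemem() on a toy vector (the
objects x and y are made up for illustration). tracemem() only works if R was
built with memory profiling enabled, hence the capabilities() check:

```r
x <- numeric(1e6)   # a reasonably large vector, about 8 Mb

if (capabilities("profmem")) {
  tracemem(x)       # start reporting duplications of x
  y <- x            # no copy yet: x and y share the same memory
  y[1] <- 1         # modifying y forces a copy; tracemem prints a message
  untracemem(x)     # stop tracing
}
```

Each line printed by tracemem corresponds to one duplication of the traced
object, so many lines for a large object in your real code point at the places
where memory use grows.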


My two cents

