[R] Memory usage reported by gc() differs from 'top'

Martin Morgan mtmorgan at fhcrc.org
Thu Apr 18 15:39:14 CEST 2013


On 04/18/2013 03:18 AM, Milan Bouchet-Valat wrote:
> On Wednesday 17 April 2013 at 23:17 -0400, Christian Brechbühler wrote:
>> In help(gc) I read, "...the primary purpose of calling 'gc' is for the
>> report on memory usage".
>> What memory usage does gc() report?  And more importantly, which memory
>> uses does it NOT report?  Because I see one answer from gc():
>>
>>              used  (Mb) gc trigger   (Mb) max used  (Mb)
>> Ncells 14875922 794.5   21754962 1161.9 17854776 953.6
>> Vcells 59905567 457.1   84428913  644.2 72715009 554.8

From the R side of things, this is an (approximate) accounting of the memory
actually reached by objects in the current session. One possible reason for the
discrepancy with the OS is that you are using a package that references memory R
does not know about (e.g., through 'external pointers'), or that there is a
memory leak in R or in a third-party package, so that memory is never returned
to the OS. Even if the reason is 'memory fragmentation' as suggested by Milan,
it is interesting to understand how that fragmentation arises, either to
identify a work-around or, more productively, to understand and address the
underlying problem.
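
To make the R-side figure directly comparable with the RES column of 'top', the
"(Mb)" columns of the gc() report can be added up. The following is only a
sketch: gc_totals_mb is a hypothetical helper name, and the matrix is simply the
report quoted above typed in by hand.

```r
# Sum the "(Mb)" columns of a gc() report (gc_totals_mb is a hypothetical name).
gc_totals_mb <- function(g = gc()) {
  # Column 2 is "(Mb)" for 'used' and column 4 is "(Mb)" for 'gc trigger'.
  # The "(Mb)" for 'max used' is the last column; some builds insert a
  # "limit (Mb)" column before it, so it is indexed from the end.
  c(used = sum(g[, 2]), trigger = sum(g[, 4]), max_used = sum(g[, ncol(g)]))
}

# The report quoted in this thread, rebuilt as a matrix for illustration.
reported <- matrix(
  c(14875922, 794.5, 21754962, 1161.9, 17854776, 953.6,
    59905567, 457.1, 84428913,  644.2, 72715009, 554.8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Ncells", "Vcells"),
                  c("used", "(Mb)", "gc trigger", "(Mb)", "max used", "(Mb)")))

totals <- gc_totals_mb(reported)
print(totals)  # used 1251.6, trigger 1806.1, max_used 1508.4 (all in Mb)
```

Calling gc_totals_mb() with no argument does the same for the live session.
Whatever the process holds beyond these totals is memory that gc() does not
account for.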

So a reasonable avenue is to develop a minimal, reproducible example of how one 
could arrive at the situation you report.
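
A starting point for such an example might be a script that records gc() totals
next to the resident size seen by the OS after each step. This is a rough
sketch, not a diagnosis: rss_mb and snapshot are hypothetical helpers, the code
assumes Linux's /proc (falling back to a POSIX 'ps'), and the allocation pattern
is invented purely for illustration.

```r
# Resident set size of the current R process in Mb, or NA if unavailable.
# Hypothetical helper; assumes Linux /proc or a POSIX 'ps' on the PATH.
rss_mb <- function() {
  status <- "/proc/self/status"
  if (file.exists(status)) {
    line <- grep("^VmRSS:", readLines(status), value = TRUE)
    if (length(line) == 1L)
      return(as.numeric(gsub("[^0-9]", "", line)) / 1024)  # VmRSS is in kB
  }
  out <- tryCatch(
    system2("ps", c("-o", "rss=", "-p", Sys.getpid()),
            stdout = TRUE, stderr = FALSE),
    error = function(e) character(0), warning = function(w) character(0))
  if (length(out) >= 1L) as.numeric(out[1L]) / 1024 else NA_real_
}

# One row per step: what gc() accounts for versus what the OS reports.
snapshot <- function(label) {
  g <- gc()  # runs a collection and returns the usage matrix
  data.frame(step = label, gc_used_mb = sum(g[, 2]), rss_mb = rss_mb())
}

usage <- snapshot("start")
x <- lapply(1:100, function(i) numeric(1e5))  # roughly 76 Mb in 100 vectors
usage <- rbind(usage, snapshot("allocated"))
x <- x[seq(1, length(x), by = 2)]             # drop every other vector
usage <- rbind(usage, snapshot("half freed"))
rm(x)
usage <- rbind(usage, snapshot("all freed"))
print(usage)
```

If rss_mb stays high while gc_used_mb falls back, the gap is memory the process
still holds but R no longer counts. Replacing the toy allocation with steps from
the real analysis, one at a time, should show where the gap opens up.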

Martin

>> (That's about 1.5g max used, 1.8g trigger.)
>> And a different answer from an OS utility, 'top':
>>
>>     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>
>>    6210 brech     20   0 18.2g 7.2g 2612 S    1 93.4  16:26.73 R
>>
>> So the R process is holding on to 18.2g of memory, but it only seems to
>> account for 1.5g or so.
>> Where is the rest?
>>
>> I tried searching the archives, and found answers like "just buy more RAM",
>> which doesn't exactly answer my question.  And come on, 18g is pretty big;
>> sure it doesn't fit in my RAM (only 7.2g are in), but that's beside the
>> point.
>>
>> The huge memory demand is specific to R version 2.15.3 Patched (2013-03-13
>> r62500) -- "Security Blanket".  The same test runs without issues under R
>> version 2.15.1 beta (2012-06-11 r59557) -- "Roasted Marshmallows".
>>
>> I appreciate any insights you can share into R's memory management, and
>> gc() in particular.
>> /Christian
> First, completely stop looking at virtual memory: it does not mean much, if
> anything. What you care about is resident memory. See e.g.:
> http://serverfault.com/questions/138427/top-what-does-virtual-memory-size-mean-linux-ubuntu
>
> Then, there is a limitation with R/Linux: gc() does not compact memory, i.e.
> it does not move objects so that they all sit in one contiguous area. This
> means that while gc() reports about 1.25 GB in use (794.5 Mb of Ncells plus
> 457.1 Mb of Vcells), those objects are spread across many memory pages, and a
> single live object in a page keeps that whole page allocated to the process.
>
> I do not know the exact details, but it seems that Windows does a better
> job than Linux in that regard. One workaround is to save the session and
> restart R: objects will be loaded in a more compact fashion.
>
> As for the differences between R 2.15.1 and R 2.15.3, maybe there is some
> more copying that increases memory fragmentation, but the fundamental
> problem has not changed AFAIK. You can call tracemem() on large objects
> to see how many times they are being copied. See
> http://developer.r-project.org/memory-profiling.html
>
>
> My two cents
>
>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


