[Rd] Interpreting R memory profiling statistics from Rprof() and gc()

Tomas Kalibera tomas.kalibera at gmail.com
Mon May 29 15:09:19 CEST 2017


On 05/18/2017 06:54 PM, Joy wrote:
> Sorry, this might be a really basic question, but I'm trying to interpret
> the results from memory profiling, and I have a few questions (marked by
> *Q#*).
>
>  From the summaryRprof() documentation, it seems that the four columns of
> statistics that are reported when setting memory.profiling=TRUE are
> - vector memory in small blocks on the R heap
> - vector memory in large blocks (from malloc)
> - memory in nodes on the R heap
> - number of calls to the internal function duplicate in the time interval
> (*Q1:* Are the units of the first 3 stats in bytes?)
In Rprof.out, vector memory in small and large blocks is given in 8-byte 
units (for historical reasons), while memory in nodes is given in bytes - 
this is not guaranteed by the documentation. In 
summaryRprof(memory="both"), memory usage is given in megabytes, as 
documented.
For summaryRprof(memory="stats") and summaryRprof(memory="tseries") I 
clarified this in r72743: memory usage is now in bytes, and that is 
documented.
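
For example (a minimal sketch; "prof.out" and the toy workload are just 
placeholders):

  Rprof("prof.out", memory.profiling = TRUE)
  x <- lapply(1:100, function(i) rnorm(1e4))  # allocation-heavy toy workload
  Rprof(NULL)

  summaryRprof("prof.out", memory = "both")   # memory usage in megabytes
  summaryRprof("prof.out", memory = "stats")  # in bytes as of r72743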
>
> and from the gc() documentation, the two rows represent
> - ‘"Ncells"’ (_cons cells_), usually 28 bytes each on 32-bit systems and 56
> bytes on 64-bit systems,
> - ‘"Vcells"’ (_vector cells_, 8 bytes each)
> (*Q2:* how are Ncells and Vcells related to small heap/large heap/memory in
> nodes?)
Ncells describe memory in nodes (Ncells is the number of nodes).

Vcells describe memory in "small heap" + "large heap". A Vcell by itself 
does not mean much today, it is kept for historical reasons, but the 
useful fact is that Vcells*8 gives the number of bytes in "small 
heap"+"large heap" objects (a Vcell is 8 bytes, as the gc() documentation 
quoted above says).
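
On a 64-bit system you can recover the byte counts from gc() output with 
this arithmetic (a small sketch):

  g <- gc()
  nodes   <- g["Ncells", "used"] * 56  # memory in nodes, in bytes
  vectors <- g["Vcells", "used"] * 8   # "small heap" + "large heap", in bytes
  c(nodes = nodes, vectors = vectors)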

> And I guess the question that led to these other questions is - *Q3:* I'd
> like to plot out the total amount of memory used over time, and I don't
> think Rprofmem() gives me what I'd like to know because, as I'm
> understanding it, Rprofmem() records the amount of memory allocated with
> each call, but this doesn't tell me the total amount of memory R is using,
> or am I mistaken?
Rprof controls a sampling profiler which regularly asks the GC how much 
memory is currently in use on the R heap (but beware: some of that memory 
may no longer be reachable yet has not been collected - running the GC 
more frequently helps - and some may still be reachable but will never be 
used again). You can get this data via summaryRprof(memory="tseries") and 
plot it: sum columns 1+2 or 1+2+3, depending on what you want. In r72743 
or more recent the values are in bytes; in older versions you need to 
multiply columns 1 and 2 by 8. To run the GC more frequently you can use 
gctorture.
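
Putting that together, a sketch of such a plot (assuming r72743 or newer, 
so the time series is already in bytes; the workload is a placeholder):

  Rprof("prof.out", memory.profiling = TRUE)
  invisible(lapply(1:200, function(i) rnorm(1e5)))  # toy workload
  Rprof(NULL)

  ts <- summaryRprof("prof.out", memory = "tseries")
  heap <- rowSums(ts[, 1:2])  # columns 1+2: small + large vector memory
  plot(as.numeric(rownames(ts)), heap / 2^20, type = "l",
       xlab = "time (s)", ylab = "R heap in use (MB)")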

Or, if you are happy to modify your own R code and you don't insist on 
querying the memory size very frequently, you can simply call 
gc(verbose=TRUE) at points of interest. For this you won't need the 
profiler at all.
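
For example (a sketch; assumes a 64-bit system, i.e. 56 bytes per node):

  for (i in 1:5) {
    x <- rnorm(1e6)  # some work
    g <- gc()        # runs a collection and reports usage
    message(sprintf("step %d: %.1f MB in use", i,
                    sum(g[, "used"] * c(56, 8)) / 2^20))
  }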

If you were instead looking at how much memory the whole R instance is 
using (that is, including memory allocated by the R GC but not presently 
used for R objects, as well as memory outside the R heap), the easiest 
way would be to use the facilities of your OS.
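
E.g. on Linux or macOS you could ask ps for the resident set size of the 
running R process (a sketch; not portable to Windows):

  rss_kb <- as.numeric(system(sprintf("ps -o rss= -p %d", Sys.getpid()),
                              intern = TRUE))
  rss_kb / 1024  # resident memory of the whole R process, in MB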

Rprofmem is a different tool - it logs individual allocations as they 
happen, not total usage over time - so it won't help you here.

Best
Tomas

>
> Thanks in advance!
>
> Joy


