[Rd] [R] Suspected memory leak with R v.2.5.x and large matrices with dimnames set
luke at stat.uiowa.edu
Sun Aug 19 00:00:08 CEST 2007
Seth's analysis is correct. R does return what it can to the malloc
system by calling free. When and how much memory malloc releases back
to the OS varies with OS and malloc system and also depends on the
sizes of allocations. R currently allocates its memory for small
objects in pages of about 2K. On Mac OS X if that is increased to
about 16K then much more is returned to the OS. On Linux (Fedora 7 on
i386 at least) the amount would have to be pushed up to around 2M to
make a difference. Increasing page size reduces R's ability to release
pages, so an increase to that level would probably not be a good idea.
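To illustrate only the second half of that trade-off (larger pages are harder for R to release), here is a rough simulation sketch. It is not R's allocator: it assumes a hypothetical fixed 32-byte object size, a 32 MB heap, and survivors scattered uniformly at random, and it treats a page as releasable only when no live object remains on it. It does not model how malloc decides what to hand back to the OS.

```python
import random

OBJECT_BYTES = 32  # hypothetical small-object size, chosen for illustration


def releasable_fraction(live, page_bytes):
    """Return the fraction of pages that hold no live object."""
    slots_per_page = page_bytes // OBJECT_BYTES
    n_pages = len(live) // slots_per_page
    empty = sum(
        1
        for p in range(n_pages)
        if not any(live[p * slots_per_page:(p + 1) * slots_per_page])
    )
    return empty / n_pages


def simulate(heap_bytes=32 * 1024 * 1024, survive_prob=0.001, seed=1):
    """Mark a random subset of slots live, then score several page sizes."""
    rng = random.Random(seed)
    n_slots = heap_bytes // OBJECT_BYTES
    # True means the object in that slot survived garbage collection.
    live = [rng.random() < survive_prob for _ in range(n_slots)]
    # Page sizes mentioned above: about 2K, 16K and 2M.
    page_sizes = [2 * 1024, 16 * 1024, 2 * 1024 * 1024]
    return {size: releasable_fraction(live, size) for size in page_sizes}


if __name__ == "__main__":
    for size, frac in simulate().items():
        print(f"page size {size:>8} bytes: {frac:.1%} of pages could be freed")
```

Because each larger page is an aligned group of smaller ones, the releasable fraction can only stay the same or fall as the page size grows; how fast it falls depends on the assumed survival rate and on how clustered the survivors are.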
Whether or not malloc releases memory back to the OS shouldn't make
much difference to a single R process; it might come into play if you
are trying to run multiple memory-intensive processes on the same
machine, though even that may vary among OS/malloc systems.
The changes Seth mentions are not likely to help in this case. They
are primarily intended to improve performance when there are many
non-unique character vectors; there is additional overhead for many
unique vectors, which we will try to reduce over time.
On Sat, 18 Aug 2007, Peter Waltman wrote:
> Hi Seth -
> Thanks for the follow up. I'll definitely check out the devel version
> at some point since while I've come up with a workaround, this is
> causing problems for me as it uses up so much memory on some systems
> that R starts throwing malloc errors and has to be killed from the
> command line. The machine I'm thinking of in particular is a MacOS
> machine with 8 gigs of memory.
> Also, having the row and column names set to alphanumeric names causes
> the processing to slow down significantly, by as much as a factor of 10
> (or more).
> As for your speculation that the memory released by R may not be
> recognized as being free'd by the OS, as a further test, I re-ran my
> code snippet three consecutive times w/in the same R interpreter
> window. In theory, if there were a memory leak, after the first run
> (resulting in a memory stamp of 2 gig), the subsequent runs would
> further increase R's memory stamp, i.e. up to 4 after the second, and 6
> for the 3rd.
> This didn't happen, and R's stamp remained at 2 gig, so I can only
> assume that you're correct and I was wrong about a leak.
> Still, it's quite the memory hog when using dimnames, so I'll have to
> avoid those for now and will try the devel version you mentioned.
> Thanks and have a good weekend,
> Seth Falcon wrote:
>> Hi Peter,
>> Peter Waltman <waltman at cs.nyu.edu> writes:
>>> Admittedly, this may not be the most sophisticated memory profiling
>>> performed, but when using unix's top command, I'm noticing a notable
>>> memory leak when using R with a large matrix that has dimnames
>> I'm not sure I understand what you are reporting. One thing to keep
>> in mind is that how memory released by R is handled is OS dependent
>> and one will often observe that after R frees some memory, the OS does
>> not report that amount as now free.
>> Is what you are observing preventing you from getting things done, or
>> just a concern that there is a leak that needs fixing? It is worth
>> noting that the internal handling of character vectors has changed in
>> R-devel and so IMO testing there would make sense before pursuing this
>> further; I suspect your results will be different.
>> + seth
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
241 Schaeffer Hall email: luke at stat.uiowa.edu
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu