[R] Memory Fragmentation in R

Nawaaz Ahmed nawaaz at inktomi.com
Sat Feb 19 18:18:36 CET 2005


I have a data set of roughly 700MB which grows to about 2GB during 
processing (I'm using a 4GB Linux box). After the work is done I clean up 
with rm(), and usage drops back to around 700MB. Yet I find I cannot run 
the same routine again: it claims it cannot allocate memory, even though 
the gcinfo() output reports about 1.1GB of heap free.

	At the start of the second time
	===============================
           	 used  (Mb) gc trigger   (Mb)
	Ncells  2261001  60.4    3493455   93.3
	Vcells 98828592 754.1  279952797 2135.9

	Before Failing
	==============
	Garbage collection 459 = 312+51+96 (level 0) ...
	1222596 cons cells free (34%)
	1101.7 Mbytes of heap free (51%)
	Error: cannot allocate vector of size 559481 Kb
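
For reference, the output above comes from R's own reporting facilities, 
roughly like this:

	gcinfo(TRUE)   # print a summary line at every garbage collection
	gc()           # run a collection and return the Ncells/Vcells table above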

This looks like a fragmentation problem. Does anyone have a handle on this 
situation (i.e. any workaround)? Is anyone working on improving R's 
fragmentation behaviour?
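
One workaround I am considering (just a sketch, with made-up file and 
script names) is to push the heavy step into a fresh R process, so the 
big allocation is made into an unfragmented heap:

	## save the inputs the heavy routine needs
	save(big_input, file = "chunk.RData")

	## heavy_step.R loads chunk.RData, does the work, and save()s its
	## results to results.RData in a clean R session
	system("R --vanilla -f heavy_step.R")

	## pick the results back up in the main session
	load("results.RData")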

On the other hand, is it possible there is a memory leak? To make my 
functions work on this dataset I tried to eliminate copies by coding with 
references (basic new.env() tricks). I presume my cleanup released the 
temporary data, as the gc output at the start of the second round of 
processing suggests. Is it possible that it was not really freed and is 
still sitting around somewhere, even though gc() thinks it has been 
returned?
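
In case it helps, here is roughly what my reference trick looks like 
(simplified, with made-up names):

	ref <- new.env()
	ref$x <- numeric(1e7)          # large object stored inside the environment
	scale_in_place <- function(e, k) {
	    ## environments are passed by reference, so this call does not
	    ## duplicate e$x the way passing a plain vector would
	    e$x <- e$x * k
	}
	scale_in_place(ref, 2)
	rm(ref); gc()                  # drop the reference so gc() can reclaim the memory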

Thanks - any clues to follow up on would be very helpful.
Nawaaz



