[Rd] Moderating consequences of garbage collection when in C

dhinds at sonic.net
Thu Nov 10 07:12:11 CET 2011


Martin Morgan <mtmorgan at fhcrc.org> wrote:
> Allocating many small objects triggers numerous garbage collections as R 
> grows its memory, seriously degrading performance. The specific use case 
> is in creating a STRSXP of several 1,000,000's of elements of 60-100 
> characters each; a simplified illustration understating the effects 
> (because there is initially little to garbage collect, in contrast to an 
> R session with several packages loaded) is below.

What a coincidence -- I was just going to post a question about why it
is so slow to create a STRSXP of ~10,000,000 unique elements, each ~10
characters long.  I had noticed that the run time seemed to scale much
worse than linearly with the number of elements.  I had not thought of
garbage collection as the culprit -- but indeed it is.  By manipulating
the GC trigger, I can make this operation take as little as 3 seconds
(with no GC) or as long as 76 seconds (with 31 garbage collections).
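
To see how much of the run time goes to the collector, something like the
following can be run at the R prompt.  It is only a rough sketch, not the
code behind the timings above: the vector length n and the "id%08d" format
are made up for illustration (1,000,000 unique 10-character strings, so it
finishes quickly), and it builds the strings from R rather than from C.

```r
# Illustrative sizes only: 1e6 unique strings of 10 characters each.
n <- 1e6

invisible(gc.time(on = TRUE))   # make sure GC timing is enabled
gc_before <- gc.time()          # cumulative time spent in GC so far

# Build a character vector (a STRSXP) of n unique elements.
elapsed <- system.time(x <- sprintf("id%08d", seq_len(n)))

gc_after <- gc.time()

print(elapsed)                  # total time for the allocation
print(gc_after - gc_before)     # portion of that time spent in GC
```

Calling gcinfo(TRUE) beforehand makes R report each collection as it
happens, which gives a rough count.  One way to move the trigger (an
assumption on my part about what is convenient, not a statement of how the
numbers above were obtained) is to start R with larger --min-nsize and
--min-vsize values; see ?Memory.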

-- Dave


