[Rd] R scripts slowing down after repeated calls to compiled code

Michael Braun braunm at MIT.EDU
Sat May 26 15:16:22 CEST 2007


Oleg:

No, I'm not using any temp files.  The only external library I use is the
GSL library, and I have counted, and re-counted, my gsl_matrix(vector)_alloc
and gsl_matrix(vector)_free statements to be sure that they balance.

In cases where they weren't balanced, memory usage would increase very
rapidly.  That does not seem to be happening here.
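
Just to check that I am measuring the right thing: would something along these
lines be a sensible way to watch for growth from the R side?  (my_compiled_call()
and dat are stand-ins for my real .Call entry point and data.)

   usage <- numeric(100)
   for (i in 1:100) {
       out <- .Call("my_compiled_call", dat)   # stand-in for the real routine
       usage[i] <- gc()["Vcells", "used"]      # R heap cells in use after a full gc
   }
   plot(usage, type = "l")   # a steady upward trend would point to something not being freed

My understanding is that this only sees R's own heap; anything malloc'd inside
the C code by GSL would not show up in gc() and would only be visible in the
overall process size.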

What about setting the minimum size of the vector heap to something very large
(say, 4 GB)?  Might that help?  I really don't understand how those settings
work, or what the output of gc() means, well enough to use them to diagnose the
problem.
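
Concretely, I was thinking of something like this (the sizes are only examples,
and I may well be misreading how the options work):

   ## start R with larger initial heaps, e.g. from the shell
   ## (4000M as a stand-in for roughly 4 GB):
   ##   R --min-vsize=4000M --min-nsize=10M
   gcinfo(TRUE)   # report every garbage collection as it happens
   gc()           # rows Ncells/Vcells: current use and the trigger that forces a collection

As far as I understand, these only set the initial sizes; the heap still grows
as needed, so a large minimum would mainly cut down on early garbage
collections rather than raise any hard limit.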

Thanks,

Michael

-----Original Message-----
From: Oleg Sklyar [mailto:osklyar at ebi.ac.uk] 
Sent: Saturday, May 26, 2007 7:42 AM
To: braunm at MIT.EDU
Cc: r-devel at r-project.org
Subject: Re: [Rd] R scripts slowing down after repeated calls to compiled code

I work with images, with a lot of the processing done in C code.  Quite often I
allocate memory there, up to several gigabytes in chunks of 10-15 MB each, plus
hundreds of protected dims, names etc.  I had a similar problem only once,
when, due to some erroneous use of an external library, internally created
objects were not freed correctly.  After correcting that, I have never seen any
slowdown, however many objects are created and manipulated.  That memory leak
was so difficult to track down that I would really suggest double- and
triple-checking all the memory allocations.  Your code does not use any temp
files?  That can be a real pain.

Oleg

Dirk Eddelbuettel wrote:
> On 25 May 2007 at 19:12, Michael Braun wrote:
> | So I'm stuck.  Can anyone help?
> 
> It sounds like a memory issue. Your memory may just get fragmented.
> One tool that may help you find leaks is valgrind -- see the 'R
> Extensions' manual. I can also recommend the visualisers like
> kcachegrind (part of KDE).
> 
> But it may not be a leak. I found that R just doesn't cope well with
> many large memory allocations and releases -- I often loop over data
> requests that I subset and process. This drives my 'peak' memory use to
> 1.5 or 1.7 GB on a 32-bit multicore machine with 4, 6 or 8 GB of RAM
> (but 32-bit means the hard 3 GB per-process limit applies).  And I just
> can't loop over many such tasks.  So I now use the littler frontend to
> script this, dump the processed chunks as Rdata files and later re-read
> the pieces. That works reliably.
> 
> So one thing you could try is to dump your data in 'gsl ready' format
> from R, quit R, leave it out of the equation, and then see what happens
> if you do the iterations using only GSL and your code.
> 
> Hth, Dirk
> 
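
For what it is worth, a minimal sketch of the dump-and-reload workflow Dirk
describes (process_chunk(), chunks and the file names are only placeholders):

   ## pass 1: process each chunk and dump the result to disk
   for (i in seq_along(chunks)) {
       res <- process_chunk(chunks[[i]])                # stand-in for the real work
       save(res, file = sprintf("chunk_%03d.Rdata", i))
       rm(res); gc()                                    # release memory before the next chunk
   }

   ## pass 2, possibly in a fresh R session: re-read the pieces
   files  <- sort(list.files(pattern = "^chunk_.*\\.Rdata$"))
   pieces <- lapply(files, function(f) { load(f); res })

Running each pass (or even each chunk) as a separate littler/R invocation keeps
the per-process memory footprint small, which I take to be Dirk's point.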

--
Dr Oleg Sklyar | EBI-EMBL, Cambridge CB10 1SD, UK | +44-1223-494466


