[Rd] Return values from .Call and garbage collection

Sklyar, Oleg (London) osklyar at maninvestments.com
Tue Jan 27 13:25:12 CET 2009


- R is not multithreaded (or at least it was not), so a race condition
cannot occur.
- I would think the GC is not invoked while the return value is being
assigned to a variable. The GC only runs inside other R calls because,
as mentioned above, R is not multithreaded.

The most likely issue is your code itself: out-of-range indexing, failure
to initialise all elements of the allocated structure, 1-based rather
than 0-based indexing, or initialisation from other R variables that
should have been protected but were not.
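
To make the indexing point concrete, here is a minimal sketch in plain C
of a bounds-checked, 0-based, column-major write. The helper names
col_major_offset and set_cell are hypothetical and are not part of the R
API; in real .Call code the buffer would be INTEGER(retVal) and the
dimensions would be the ones passed to allocMatrix.

```c
#include <stddef.h>

/* Offset of the 0-based cell (row, col) in a column-major matrix with
 * nrow rows. R stores matrix data column by column. */
static size_t col_major_offset(size_t row, size_t col, size_t nrow)
{
    return row + col * nrow;
}

/* Hypothetical helper: write value into cell (row, col) of an
 * nrow x ncol matrix stored in data. Returns 0 on success and -1 if
 * the indices are out of range, in which case nothing is written. */
static int set_cell(int *data, size_t nrow, size_t ncol,
                    size_t row, size_t col, int value)
{
    if (row >= nrow || col >= ncol)
        return -1;  /* catches 1-based indices that run off the end */
    data[col_major_offset(row, col, nrow)] = value;
    return 0;
}
```

For a 3 x 4 matrix, the 0-based cell (2, 1) maps to offset 2 + 1*3 = 5,
while row index 3 is rejected. Memory returned by allocMatrix is not
zeroed, so every cell needs an explicit write before the matrix goes
back to R.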

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3107
osklyar at maninvestments.com 

> -----Original Message-----
> From: r-devel-bounces at r-project.org 
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Jon Senior
> Sent: 27 January 2009 12:09
> To: r-devel at r-project.org
> Subject: [Rd] Return values from .Call and garbage collection
> 
> Hi all,
> 
> I'm posting this here as it discusses an issue with an 
> external C library. If it would be better in R-Help, then I'll repost.
> 
> I'm using an external library which I've written, which 
> provides a large set of data (>500MB in a highly condensed 
> format) and the tools to return values from the data. The 
> functionality has been tested call by call and using valgrind 
> and works fine, with no memory leaks. After retrieval, I 
> process the data in R. A specific function is causing a 
> problem that appears to be related to the garbage collector 
> (judging by symptoms).
> 
> In the C code, a Matrix is created using
> 
> PROTECT(retVal = allocMatrix(INTSXP, x, y));
> 
> Values are written into this matrix using
> 
> INTEGER(retVal)[translatedOffset]=z;
> 
> where "translatedOffset" is a conversion from a row/column 
> pair to an offset as shown in R-exts.pdf.
> 
> The last two lines of the function call are:
> 
> UNPROTECT(1);
> return retVal;
> 
> The shared library was compiled with R CMD SHLIB and is 
> called using .Call.
> 
> Which returns our completed SEXP object to R where processing 
> continues.
> 
> In R, we continue to process the data, replacing -1s with NAs 
> (I couldn't find a way to do that in C that would make it back 
> into R), sorting it, and trimming it. All of these operations 
> are carried out on the original data.
> 
> If I carry out the processing step by step from the 
> interpreter, everything is fine and the data comes out how I 
> would expect. But when I run the R code to carry out those 
> steps, every now and again (around 1/5th of the time), the 
> returned data is garbage. I'm expecting to receive a bias per 
> iteration that should be -5 <= bias <= 5, but for the 
> garbaged data, I'm getting results of the order of 100s of 
> thousands out (e.g. -220627.7). If I call the routine which 
> carries out the processing for one iteration from the 
> interpreter, sometimes I get the correct data, sometimes (with 
> the same frequency) I get garbage.
> 
> There are two possibilities that I can envisage.
> 1) Race condition: R is starting to execute the R code after 
> the .Call before the .Call has returned, thus the data is corrupted.
> 2) Garbage collector: the GC is collecting my data between 
> the UNPROTECT(1); call and the assignment to an R variable.
> 
> The created matrices can be large (where x > 1000, y > 
> 100000), but the garbage doesn't appear to be related to the 
> size of the matrix.
> 
> Any ideas what steps I could take to proceed with this? Or 
> other possibilities than those I've suggested? For reasons of 
> confidentiality I'm unable to release test code, and the 
> large dataset might make testing difficult.
> 
> Thanks in advance
> 
> -- 
> Jon Senior <jon at restlesslemon.co.uk>
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
