[Rd] garbage collection, "preserved" variables, and different outcome depending on "--verbose" or not
Laurent Gautier
lgautier at gmail.com
Sun Jul 20 16:49:28 CEST 2008
2008/7/20 Duncan Murdoch <murdoch at stats.uwo.ca>:
> On 20/07/2008 10:02 AM, Laurent Gautier wrote:
>>
>> I tripped on that while crafting the example.
>>
>> The problem still exists when moving the "releases" in the middle,
>> and removing the last "release".
I also see that the C code contains old/irrelevant comments at the top
of the function lostobject. Sorry about that, the bug I am chasing is
elusive...
> I can't spot any problems in the new version of your code, but I can't
> reproduce the problem, either. So it appears to be system specific -- I was
> using the identical revision as you, but on Windows, not
> "x86_64-unknown-linux-gnu". What exact Linux is that?
Ubuntu (hardy heron).
gcc is 4.2.3
The R source was built with "make clean; ./configure --enable-R-shlib"
before running "make; make install"
> Can others using that system (or similar ones) reproduce it?
>
> Duncan Murdoch
>
>>
>>
>> #include <R.h>
>> #include <Rdefines.h>
>>
>>
>> SEXP createObject(void)
>> {
>> SEXP x_R;
>> int len_x = 1000000;
>> PROTECT(x_R = allocVector(REALSXP, len_x));
>> Rprintf("Created 'x' at %p\n", x_R);
>> Rprintf(" (mode is %i, length is %i)\n", TYPEOF(x_R), LENGTH(x_R));
>> Rprintf(" (first element is %f)\n", REAL(x_R)[0]);
>> R_PreserveObject(x_R);
>> UNPROTECT(1);
>> return x_R;
>> }
>>
>> void printObject(SEXP sexp)
>> {
>> Rprintf("object at %p\n", sexp);
>> Rprintf(" (mode is %i, length is %i, named is %i)\n",
>> TYPEOF(sexp), LENGTH(sexp), NAMED(sexp));
>> }
>>
>> SEXP lostobject(SEXP n_R)
>> {
>> /*
>> * This function will:
>> * 1- create a numerical vector "x" and "preserve it"
>> * 2- make call "list(x)"
>> * 3- return "x" to R
>> */
>>
>>
>> SEXP x_R;
>> int i;
>>
>> int n = INTEGER(n_R)[0];
>>
>> /* Create a numerical vector "x_R" */
>>
>> for (i=0; i<n; i++) {
>> x_R = createObject();
>> printObject(x_R);
>> R_ReleaseObject(x_R);
>> R_gc();
>> }
>>
>> x_R = createObject();
>> printObject(x_R);
>> R_gc();
>>
>> Rprintf("Returning 'x' at %p\n", x_R);
>> Rprintf(" (first element is %f)\n", REAL(x_R)[0]);
>> return x_R;
>> }
>>
>>
>> 2008/7/20 Duncan Murdoch <murdoch at stats.uwo.ca>:
>>>
>>> On 20/07/2008 9:01 AM, Laurent Gautier wrote:
>>>>
>>>> Dear list,
>>>>
>>>> While trying to identify the root of a problem I am having with
>>>> garbage collected variables,
>>>> I have come across the following oddity: depending on whether --verbose
>>>> is set or not, I obtain different results.
>>>
>>> You are working with variables without protecting them, so you just get
>>> lucky whenever the code works.
>>>
>>> More below...
>>>
>>>> I have made a small standalone example to demonstrate it.
>>>> The example is very artificial, but I had a hard time reproducing
>>>> the problem reliably.
>>>>
>>>> So when I do: (the content of test.R is at the end of this email)
>>>>
>>>> R --no-save < test.R
>>>>
>>>> [The last two lines of the output are:]
>>>>
>>>> > x[1:3]
>>>> [1] 0 0 0
>>>>
>>>> while with
>>>>
>>>> R --verbose --no-save < test.R
>>>>
>>>> [The last two lines of the output are:]
>>>>
>>>> > x[1:3]
>>>> [1] 3.733188e-317 3.137345e-317 3.137345e-317
>>>>
>>>>
>>>> The C code is compiled with:
>>>> R CMD SHLIB test_lostobject.c
>>>>
>>>>
>>>> > sessionInfo()
>>>> R version 2.7.1 Patched (2008-07-19 r46081)
>>>> x86_64-unknown-linux-gnu
>>>>
>>>> locale:
>>>>
>>>>
>>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>>
>>>>
>>>> ### -- file test.R
>>>>
>>>> dyn.load("test_lostobject.so")
>>>>
>>>> x = .Call("lostobject", as.integer(5))
>>>>
>>>> x[1:3]
>>>>
>>>>
>>>> ### ---
>>>>
>>>> ###--- file lostobject.c
>>>>
>>>> #include <R.h>
>>>> #include <Rdefines.h>
>>>>
>>>>
>>>>
>>>> SEXP createObject(void)
>>>> {
>>>> SEXP x_R;
>>>> int len_x = 1000000;
>>>> PROTECT(x_R = allocVector(REALSXP, len_x));
>>>> Rprintf("Created 'x' at %p\n", x_R);
>>>> Rprintf(" (mode is %i, length is %i)\n", TYPEOF(x_R), LENGTH(x_R));
>>>> Rprintf(" (first element is %f)\n", REAL(x_R)[0]);
>>>> R_PreserveObject(x_R);
>>>> UNPROTECT(1);
>>>> return x_R;
>>>> }
>>>>
>>>> void printObject(SEXP sexp)
>>>> {
>>>> Rprintf("object at %p\n", sexp);
>>>> Rprintf(" (mode is %i, length is %i, named is %i)\n",
>>>> TYPEOF(sexp), LENGTH(sexp), NAMED(sexp));
>>>> }
>>>>
>>>> SEXP lostobject(SEXP n_R)
>>>> {
>>>> /*
>>>> * This function will:
>>>> * 1- create a numerical vector "x" and "preserve it"
>>>> * 2- make call "list(x)"
>>>> * 3- return "x" to R
>>>> */
>>>>
>>>>
>>>> SEXP x_R;
>>>> int i;
>>>>
>>>> int n = INTEGER(n_R)[0];
>>>>
>>>> /* Create a numerical vector "x_R" */
>>>>
>>>> for (i=0; i<n; i++) {
>>>> x_R = createObject();
>>>> R_ReleaseObject(x_R);
>>>
>>> At this point, the variable is unprotected, i.e. you have declared that
>>> its memory is free for the taking. You should not try to do anything
>>> with it. printObject calls several functions, and one of those may have
>>> overwritten the memory. It's not surprising that different flags
>>> (--verbose or not) result in different behaviour.
>>>
>>>> printObject(x_R);
>>>> R_gc();
>>>> }
>>>>
>>>> x_R = createObject();
>>>> printObject(x_R);
>>>> R_gc();
>>>> R_ReleaseObject(x_R);
>>>
>>> Same thing here. x_R is unprotected now, so you shouldn't use it.
>>>
>>> Duncan Murdoch
>>>
>>>> Rprintf("Returning 'x' at %p\n", x_R);
>>>> Rprintf(" (first element is %f)\n", REAL(x_R)[0]);
>>>> return x_R;
>>>> }
>>>>
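
To make the point about released objects concrete, here is a small
standalone toy model in plain C. It is only a sketch under loud
assumptions: it does not use R's API or reproduce R's collector, and every
name in it (Slot, toy_create, toy_release, toy_gc, POISON,
read_after_release, read_before_release) is hypothetical. A "preserved"
flag stands in for R_PreserveObject()/R_ReleaseObject(), and the toy
collector overwrites reclaimed slots with a fixed sentinel so the effect is
visible. In real R the content of released memory is simply unpredictable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a small pool of slots standing in for a managed heap. */
#define POOL_SIZE 4
#define POISON (-1.0)   /* value written into a slot when it is reclaimed */

typedef struct {
    double value;
    int in_use;      /* slot currently holds an object */
    int preserved;   /* loose analogue of being on a "preserved" list */
} Slot;

static Slot pool[POOL_SIZE];

/* Allocate a slot, store a value, and mark it preserved. */
static Slot *toy_create(double value)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].value = value;
            pool[i].in_use = 1;
            pool[i].preserved = 1;
            return &pool[i];
        }
    }
    return NULL; /* pool exhausted */
}

/* Drop the preservation: the slot may now be reclaimed at any time. */
static void toy_release(Slot *s)
{
    assert(s != NULL);
    s->preserved = 0;
}

/* Reclaim every slot that is in use but no longer preserved. */
static void toy_gc(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].preserved) {
            pool[i].value = POISON;
            pool[i].in_use = 0;
        }
    }
}

/* Problematic order: release, let the collector run, then read. */
double read_after_release(void)
{
    Slot *x = toy_create(42.0);
    assert(x != NULL);
    toy_release(x);
    toy_gc();
    return x->value;   /* stale read: the slot has been reclaimed */
}

/* Safe order: read while the object is still preserved, then release. */
double read_before_release(void)
{
    Slot *x = toy_create(42.0);
    assert(x != NULL);
    double v = x->value;
    toy_release(x);
    toy_gc();
    return v;
}
```

In this model read_before_release() returns 42.0, while
read_after_release() returns the sentinel -1.0. The only difference between
the two functions is whether the read happens before or after the release.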
