[Rd] can one modify array in R memory from C++ without copying it?

Wed Nov 2 02:29:41 CET 2011

On Nov 1, 2011, at 9:08 PM, andre zege wrote:

> Hi, guys. I posted this by accident at rcpp-dev, although it was meant to
> go only to r-dev, so don't flame me here please, rcpp guys will
> do it there, i am sure :).
> I have some pretty large arrays in R and i wanted to do some time
> consuming modifications of these arrays in C++ without actually copying
> them, just by passing pointers to them. Since i don't know internal data
> structures of R, i am not sure it's possible, but i thought it was. Here is
> some toy code that i thought should work, but doesn't. Maybe someone could
> point out the error i am making
> 
> i have the following in the passptr.cpp to multiply array elements by 2
> ===============================
> extern "C"{
> void modify(double *mem, int *nr, int *nc){
>  for(int i=0; i< (*nr)*(*nc); i++)
>    mem[i]=2*mem[i];
>   }
> }
> 
> ----------------------------------------------
> I compile it into a shared library using
> R CMD SHLIB passptr.cpp
> load and run from R as follows
> 
> --------------------------------
> 
>> dyn.load("/home/az05625/testarma/passptr.so")
> 
>> m<-matrix(1:10,nr=2)
> 
>> .C("modify", as.double(m), as.integer(2), as.integer(5), DUP=FALSE)
> 
> From reading docs i thought that DUP=FALSE would ensure that R matrix is
> not copied and is multiplied by 2 in place. However, it's not the case,
> matrix m is the same after calling .C("modify"...)
> 

Since you called as.double(), you created another copy, and it is that copy that was modified; you are not passing m to the C code. The modified version is in the result of the .C() call, not in m.
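To make the point above concrete, here is a minimal standalone C++ sketch of what happens to the data. It reuses the poster's modify() unchanged; everything else is an assumption made for illustration: std::vector stands in for R's vectors, and the hypothetical helper call_on_converted_copy() plays the role of the as.double(m) conversion followed by the .C() call. It is not how R implements either step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The poster's routine, unchanged: doubles nr*nc elements in place.
extern "C" void modify(double *mem, int *nr, int *nc) {
    for (int i = 0; i < (*nr) * (*nc); i++)
        mem[i] = 2 * mem[i];
}

// Hypothetical stand-in for the R session: `m` plays the integer matrix
// and `converted` plays the temporary created by as.double(m).
std::vector<double> call_on_converted_copy(const std::vector<int> &m,
                                           int nr, int nc) {
    assert(m.size() == static_cast<std::size_t>(nr) * nc);
    std::vector<double> converted(m.begin(), m.end()); // the as.double() copy
    modify(converted.data(), &nr, &nc);                // only the copy changes
    return converted;                                  // what the call hands back
}
```

Calling call_on_converted_copy() on the values 1 to 10 leaves the input untouched and returns the doubled values. On the R side the analogue is to keep the return value of .C(), which is a list of the (possibly modified) arguments, and take its first element.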

See recent discussion on this list - .C is always less efficient than .Call anyway.

> as it was before. Am i calling incorrectly, or is it just impossible to
> modify R matrix in place from C++? Would greatly appreciate any pointers.
> 

The real answer is that you don't want to do it. It is technically possible with .Call in some instances, but it is extremely dangerous, because you may modify other, unrelated (other than by value) objects. Note that R uses pass-by-value semantics throughout, so it should not be possible. For example, foo(x) takes the value of x; the function foo() has absolutely no idea that the value comes from the binding of x. It is just the value, so it can't modify x. It is technically possible only because R does some optimizations and doesn't copy x if not needed, but it assumes that everyone plays along. However, exploiting that can have unwanted effects:

> library(inline)
> foo = cfunction(signature(x="integer"), "INTEGER(x)[0]=1; return x;")
> a=0L
> b=a
> foo(a)
[1] 1
> a
[1] 1
> b
[1] 1

As you can see, "b" is modified even though it was not involved in the call at all! Tracing such issues can be a nightmare, so the answer is "you don't want to do it".
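The effect can be reproduced outside R with a toy model, sketched here in standalone C++. This is only an analogy and rests on assumptions: a std::shared_ptr handle stands in for a variable binding, and the names Value, unsafe_set_first() and safe_set_first() are hypothetical. R tracks sharing with its own internal mechanism, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy model only: a "variable" is a handle to a value that may be shared.
using Value = std::shared_ptr<std::vector<int>>;

// Mimics the foo() example: writes through the handle without checking
// whether any other handle refers to the same value.
void unsafe_set_first(const Value &x, int v) {
    assert(x && !x->empty());
    (*x)[0] = v;
}

// Copy-before-write: works on a private copy when the value is shared,
// so other handles to the original value are left alone.
Value safe_set_first(const Value &x, int v) {
    assert(x && !x->empty());
    Value out = (x.use_count() > 1)
                    ? std::make_shared<std::vector<int>>(*x)
                    : x;
    (*out)[0] = v;
    return out;
}
```

With two handles a and b on the same value, unsafe_set_first(a, 1) changes what b sees as well, which mirrors the session above. safe_set_first() returns a fresh value in that situation and leaves both original handles unchanged, which is the behaviour pass-by-value semantics promises.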

Cheers,
Simon