[Rd] internal copying in R (soon to be released R-3.1.0)

Jens Oehlschlägel Jens.Oehlschlaegel at truecluster.com
Sun Mar 2 18:37:59 CET 2014


Dear core group,

Which operation in R is guaranteed to return a true copy of an atomic 
vector, not just a second symbol pointing to the same shared memory?

y <- x[]
#?

y <- x
y[1] <- y[1]
#?

Is there any function that returns its argument as a non-shared atomic 
vector, copying only if the argument was shared?
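
To pin down the semantics asked for here, a minimal sketch in C on a toy 
structure, not R's real SEXP. The names toy_vec, toy_alloc, toy_duplicate 
and toy_unshare are hypothetical and invented for illustration, and the 
named field only mimics sxpinfo_struct.named.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R integer vector; NOT R's real SEXP layout. */
typedef struct {
    int named;      /* 0, 1 or 2, mimicking sxpinfo_struct.named */
    size_t len;     /* number of elements, assumed > 0 */
    int *data;
} toy_vec;

/* Allocate a zero-filled vector that no symbol refers to yet. */
toy_vec *toy_alloc(size_t len) {
    toy_vec *v = malloc(sizeof *v);
    assert(v != NULL);
    v->named = 0;
    v->len = len;
    v->data = calloc(len, sizeof *v->data);
    assert(v->data != NULL);
    return v;
}

/* Unconditional deep copy: the "true copy" case. */
toy_vec *toy_duplicate(const toy_vec *v) {
    toy_vec *c = toy_alloc(v->len);
    memcpy(c->data, v->data, v->len * sizeof *v->data);
    return c;
}

/* Return v itself when it is not shared, otherwise a private copy. */
toy_vec *toy_unshare(toy_vec *v) {
    return v->named < 2 ? v : toy_duplicate(v);
}
```

In this model toy_unshare() hands back the same pointer while named is 
below 2 and a fresh copy once named has reached 2.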

Given an atomic vector x, what is the best official way to find out 
whether other symbols share the vector's memory? Querying NAMED() < 2 
doesn't work because .Call sets sxpinfo_struct.named to 2. It even sets 
it to 2 if the argument to .Call was a never-named expression:

 > named(1:3)
[1] 2

It also seems to set it permanently, so a pure read access can trigger 
a copy on the next modification:

 > x <- integer(1e8)
 > system.time(x[1]<-1L)
        user      system     elapsed
           0           0           0
 > system.time(x[1]<-2L)
        user      system     elapsed
           0           0           0

Having called .Call now leads to an unnecessary copy on the next assignment:

 > named(x)
[1] 2
 > system.time(x[1]<-3L)
        user      system     elapsed
        0.14        0.07        0.20
 > system.time(x[1]<-4L)
        User      System verstrichen
           0           0           0

This happens not only with user-written functions doing read access, but 
also with base functions such as is.unsorted():

 > is.unsorted(x)
[1] TRUE
 > system.time(x[1]<-5L)
        user      system     elapsed
        0.11        0.09        0.21

Why don't you simply give package authors read access to 
sxpinfo_struct.named in .Call (without setting it to 2)? That would give 
us more control and also save some unnecessary copying. I guess that once 
R switches to reference counting, the preventive increment in .Call could 
not be kept anyhow.
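
The behaviour described in this message, and the effect of the proposed 
change, can be modelled with a toy C structure. This is only a sketch 
under stated assumptions, not R's implementation: toy_vec, toy_alloc, 
toy_assign, toy_named_marking, toy_named_readonly and toy_copies are 
hypothetical names, and the rule "copy when named is 2" is a 
simplification of copy-on-modify.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an R integer vector; NOT R's real SEXP layout. */
typedef struct {
    int named;      /* mimics sxpinfo_struct.named: 0, 1 or 2 */
    size_t len;     /* number of elements, assumed > 0 */
    int *data;
} toy_vec;

int toy_copies = 0; /* number of full copies made so far */

/* Allocate a zero-filled vector with the given named value. */
toy_vec *toy_alloc(size_t len, int named) {
    toy_vec *v = malloc(sizeof *v);
    assert(v != NULL);
    v->named = named;
    v->len = len;
    v->data = calloc(len, sizeof *v->data);
    assert(v->data != NULL);
    return v;
}

/* Models x[i] <- value: modify in place unless x may be shared.
   Returns the vector the symbol is bound to afterwards. The old
   vector is deliberately left alone; in R the garbage collector
   would reclaim it if nothing else refers to it. */
toy_vec *toy_assign(toy_vec *x, size_t i, int value) {
    if (x->named >= 2) {
        toy_vec *c = toy_alloc(x->len, 1);
        memcpy(c->data, x->data, x->len * sizeof *x->data);
        toy_copies++;
        x = c;
    }
    x->data[i] = value;
    return x;
}

/* Read-only query that marks its argument as shared first, which is
   what this message reports for arguments passed through .Call. */
int toy_named_marking(toy_vec *x) {
    x->named = 2;
    return x->named;
}

/* The proposed alternative: the same query without touching named. */
int toy_named_readonly(const toy_vec *x) {
    return x->named;
}
```

Starting from a vector whose named is 1, toy_named_marking() followed by 
toy_assign() makes one copy, whereas toy_named_readonly() followed by 
toy_assign() makes none.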

Kind regards


Jens Oehlschlägel

P.S. Please cc me in answers, as I am not a member of r-devel.


P.P.S. The function named() was tentatively defined as follows:

named <- function(x)
   .Call("R_bit_named", x, PACKAGE="bit")

#include <R.h>
#include <Rinternals.h>

/* Return NAMED(x) as a length-one integer vector. */
SEXP R_bit_named(SEXP x){
   SEXP ret_;
   PROTECT( ret_ = allocVector(INTSXP,1) );
   INTEGER(ret_)[0] = NAMED(x);
   UNPROTECT(1);
   return ret_;
}


 > version
                _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status         Under development (unstable)
major          3
minor          1.0
year           2014
month          02
day            28
svn rev        65091
language       R
version.string R Under development (unstable) (2014-02-28 r65091)
nickname       Unsuffered Consequences


