[Rd] full copy on assignment?

John Chambers jmc at r-project.org
Sun Apr 4 02:54:58 CEST 2010


In particular, Duncan's comment applies in situations where the 
replacement is in a loop, obviously the case one worries about.

What happens in the stupid little function:

 > foo <- function(x) { for(i in seq_along(x)) x[i] <- x[i] + 1; x }

for the case

 > y <- 1:1e6
 > y1 <- foo(y)

How often does y get duplicated? Hopefully not a million times.  One can 
look at this in gdb, by trapping calls to duplicate1.  The answer is:  
just once, to ensure that the object is local.  Then the duplicated 
version has only one reference and the primitive replacement doesn't 
copy it.
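
For readers without gdb at hand, a rough way to watch the same thing from 
within R is tracemem(), which reports duplications of a marked object.  
This is only a sketch under assumptions: it needs an R build with memory 
profiling enabled, the vector here is shorter than the 1e6 used above and 
is stored as double to avoid an integer-to-double coercion, and the number 
of reported copies may differ between R versions.

```r
# Primitive replacement inside a loop, as in the example above.
foo <- function(x) { for (i in seq_along(x)) x[i] <- x[i] + 1; x }

# Shorter than 1e6 so it runs quickly; stored as double so that
# adding 1 does not trigger a type coercion.
y <- as.numeric(1:1e4)

if (capabilities("profmem")) {
  tracemem(y)      # each duplication of y (or of a copy of it) prints a line
  y1 <- foo(y)
  untracemem(y)
} else {
  y1 <- foo(y)     # tracemem() is unavailable in this build; just run it
}
```

If the account above holds for the R version in use, the trace should show 
very few lines rather than one per loop iteration.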

Unfortunately, as Duncan said, changing the definition to a user-written 
replacement function:

 > "sub<-" <- function(x, i, value) { x[i] <- value; x }
 > foo <- function(x) { for(i in seq_along(x)) sub(x, i) <- x[i] + 1; x }

does duplicate a million times, since every call to `sub<-` gets an 
argument with two references.
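
A small side-by-side comparison of the two variants might look like the 
following.  The names foo_prim and foo_user are made up for this 
illustration, the vector is kept short so the slower variant finishes 
quickly, and the timings will vary with machine and R version, so no 
particular numbers are claimed.

```r
# User-written replacement function, as defined above.
`sub<-` <- function(x, i, value) { x[i] <- value; x }

# Hypothetical names for the two loop variants being compared.
foo_prim <- function(x) { for (i in seq_along(x)) x[i] <- x[i] + 1; x }
foo_user <- function(x) { for (i in seq_along(x)) sub(x, i) <- x[i] + 1; x }

y <- as.numeric(1:2000)

t_prim <- system.time(r1 <- foo_prim(y))
t_user <- system.time(r2 <- foo_user(y))

# Both variants compute the same result; only the copying behaviour differs.
stopifnot(identical(r1, r2))

print(t_prim["elapsed"])
print(t_user["elapsed"])
```

Wrapping the call to foo_user in tracemem()/untracemem(), where available, 
is another way to see how often the vector is duplicated.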

John



On 4/3/10 4:42 PM, Duncan Murdoch wrote:
> On 03/04/2010 6:34 PM, Norm Matloff wrote:
>> Here's a basic question that doesn't seem to be completely answered in
>> the docs, and which unfortunately I've not had time to figure out by
>> wading through the R source code:
>>
>> In a vector (or array) element assignment such as
>>    z[3] <- 8
>> is there in actuality a full rewriting of the entire vector pointed to
>> by z, as implied by
>>
>>    z <- "[<-"(z,3,value=8)
>>
>> Assume that an element of z has already been changed previously, so
>> that copy-on-change issues don't apply, with z being reassigned back to
>> the same memory address.
>>
>> I seem to recall reading somewhere that recent R versions make some
>> attempt to avoid rewriting the entire vector, and my timing experiments
>> seem to suggest that it's true.
>> So, is a full rewrite avoided?  And where in the source code is this
>> done?
>
> It depends.  User-written assignment functions can't avoid the copy. 
> They act like the expansion
>
> z <- "[<-"(z,3,value=8)
>
> and in that, R can't tell that the newly created result of 
> "[<-"(z,3,value=8) will later overwrite z.
>
> However, if z is a regular vector without a class and you're using the 
> built-in version of z[3] <- 8, it can take some shortcuts.  This 
> happens in multiple places; one is around line 488 of subassign.c, 
> and another is around line 1336.  In each of these places copies are made 
> in some circumstances, but not in general.
>
> Duncan Murdoch
>


