[R] Is it possible to avoid copying arrays when calling list()?

MRipley mrip027 at gmail.com
Fri Aug 16 18:16:07 CEST 2013


Usually R is pretty good about not copying objects when it doesn't need 
to.  However, the list() function seems to make unnecessary copies.  For 
example:

 > system.time(x<-double(10^9))
    user  system elapsed
   1.772   4.280   7.017
 > system.time(y<-double(10^9))
    user  system elapsed
   2.564   3.368   5.943
 > system.time(z<-list(x,y))
    user  system elapsed
   5.520   6.748  12.304

I have a function where I create two large arrays, manipulate them in 
certain ways, and then return both as a list.  I'm optimizing the 
function, so I'd like to be able to build the return list quickly.  The 
two large arrays drop out of scope immediately after I make the list and 
return it, so copying them is completely unnecessary.
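
For anyone who wants to reproduce this at a smaller size, here is a minimal sketch. The size n is an arbitrary value chosen for illustration, and tracemem() only reports duplications when R was built with memory profiling support, so the check is guarded. I make no claim about what it prints on any particular R version.

```r
# Smaller-scale version of the example above; n is an illustrative size.
n <- 1e6
x <- double(n)
y <- double(n)

# tracemem() reports when an object is duplicated, but only if this
# build of R supports memory profiling.
has_profmem <- capabilities("profmem")[[1]]
if (has_profmem) {
  invisible(tracemem(x))
  invisible(tracemem(y))
}

# Any "tracemem[...]" message printed here would indicate a copy.
timing <- system.time(z <- list(x, y))

if (has_profmem) {
  untracemem(x)
  untracemem(y)
}
print(timing)
```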

Is there some way to do this?  I'm not familiar with manipulating lists 
through the .Call interface, and haven't been able to find much about 
this in the documentation.  Might it be possible to write a fast (but 
possibly unsafe) list function using .Call that doesn't make copies of 
the arguments?

PS: A few things I've tried.  First, this is not caused by garbage 
collection -- even if I call gc() before list(x,y), it still takes a 
long time.

Second, I've tried rewriting the function so that it creates the list at 
the beginning, as in:

 result <- list(x=double(10^9),y=double(10^9))

and then manipulates result$x and result$y, but this made my code run 
slower, since R seemed to make other unnecessary copies when modifying 
the elements of a list this way.
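
The two patterns I am comparing look roughly like the sketch below. The function names build_then_list and fill_list_in_place, the size n, and the single-element assignments are placeholders for illustration and stand in for the real manipulation. The timings will depend on the R version and machine.

```r
# Pattern A: work on separate vectors, then combine them with list().
build_then_list <- function(n) {
  x <- double(n)
  y <- double(n)
  x[1] <- 1   # placeholder for the real manipulation
  y[n] <- 2
  list(x = x, y = y)
}

# Pattern B: create the list first, then modify its elements.
fill_list_in_place <- function(n) {
  result <- list(x = double(n), y = double(n))
  result$x[1] <- 1   # placeholder for the real manipulation
  result$y[n] <- 2
  result
}

n <- 1e6  # illustrative size, much smaller than 10^9
print(system.time(a <- build_then_list(n)))
print(system.time(b <- fill_list_in_place(n)))

# Both patterns should produce the same result.
identical(a, b)
```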

I've considered (though not implemented) creating an environment rather 
than a list, and returning the environment, but I'd rather find a simple 
way of creating a list without making copies if possible.
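
The environment idea would look something like the sketch below. I have not timed it at full size; the function name make_arrays_env and the small n are hypothetical, and the assignments stand in for the real manipulation.

```r
# Return the two vectors in an environment instead of a list.
make_arrays_env <- function(n) {
  x <- double(n)
  y <- double(n)
  x[1] <- 1   # placeholder for the real manipulation
  y[n] <- 2

  res <- new.env(parent = emptyenv())
  assign("x", x, envir = res)
  assign("y", y, envir = res)
  res
}

e <- make_arrays_env(10)
ls(e)        # names stored in the environment
length(e$x)  # elements are accessed with $ as with a list
```

The caller would have to accept an environment, which has reference semantics, rather than a list.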


