[R] Passing references to data objects into R functions

Henrik Bengtsson hb at maths.lth.se
Thu Jul 24 12:21:58 CEST 2003


One way is to use an object-oriented design and wrap up the reference
functionality in a common superclass. At
http://www.maths.lth.se/help/R/ImplementingReferences/ I have some
discussion that is in line with what you are trying to achieve and that
you might be able to adopt.
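
As a rough sketch of the general idea (not necessarily the design
described at the URL above), an R environment can act as a reference,
since environments are not copied when passed to functions. The names
new_ref() and extract() below are hypothetical and only for
illustration:

```r
# Hypothetical helper: wrap a value in an environment, which R passes
# around without copying.
new_ref <- function(value) {
  ref <- new.env()
  assign("value", value, envir = ref)
  ref
}

# Read some rows of the wrapped object through the reference.
extract <- function(ref, index) {
  get("value", envir = ref)[index, , drop = FALSE]
}

data1 <- new_ref(matrix(1:20, nrow = 5))
ans <- extract(data1, 1:3)   # first 3 rows of the wrapped matrix
```

Only the extracted rows are allocated as a new object; the large
matrix itself stays inside the environment.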

Also, note that passing huge objects as arguments to functions is NOT
expensive (in memory or time) in R if they are used for
read-only purposes. It only becomes expensive if you assign a new value
to the argument. In such cases R *has to* copy the whole object to make
sure you only modify a local instance of the object. Thus, objects can
be thought of as being passed by reference to functions as long as they
are not modified; if modified, they are passed by value. This is
intentional, as R is a (single-threaded) functional language.
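
A small illustration of this behaviour, as a sketch only (the function
names read_only() and modify() are made up for the example):

```r
x <- matrix(0, nrow = 1000, ncol = 100)

# Read-only use: the argument is never assigned to, so R has no need
# to copy x.
read_only <- function(m) colSums(m)

# Modifying use: assigning into the argument makes R copy it first,
# so only the local instance is changed.
modify <- function(m) {
  m[1, 1] <- 99
  m
}

s <- read_only(x)
y <- modify(x)

x[1, 1]   # the caller's object is unchanged
y[1, 1]   # the returned copy carries the modification
```

If your build of R supports it, tracemem(x) can be called before the
two function calls to see when a copy is actually made.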

Best wishes

Henrik Bengtsson
Lund University

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of David 
> Khabie-Zeitoune
> Sent: den 23 juli 2003 18:18
> To: r-help at r-project.org
> Subject: [R] Passing references to data objects into R functions
> 
> 
> Hi. 
> 
> I have the following question about reading from large data 
> objects from within R functions; I have tried to simplify my 
> problem as much as possible in what follows.
> 
> Imagine I have various large data objects sitting in my 
> global environment (call them "data1", "data2", ...).  I want 
> to write a function "extract" that extracts some of the rows 
> of a particular data object, does some further manipulations 
> on the extract and then returns the result. The function 
> takes the data object's name and an index vector -- for 
> example the following call would return the first 3 rows of 
> object data1. 
> 
> ans = extract("data1", 1:3)
> 
> I could write a simple function like this:
> 
> extract1 = function(object.name, index) {
> 
>     temp = get(object.name, envir = .GlobalEnv)
>     temp = temp[index, , drop=FALSE]
> 
>     # do some further manipulations here ....
> 
>     return(temp)
> 
> }
> 
> The problem is that the function makes a copy "temp" of the 
> object in the function frame, which (in my application) is 
> very memory inefficient as the data objects are very large. 
> It is especially inefficient when the length of the "index" 
> vector is much smaller than the number of rows in the data 
> object. What I really would like to do is to be able to read 
> from the underlying data object directly (in other 
> programming languages this would be achieved by passing a 
> pointer to the object instead), without making a copy.
> 
> Given the rules of variable name scoping in R, I could avoid 
> making a copy with the following call:
> 
> extract2 = function(object.name, index) {
> 
>     eval(parse(text = paste("temp = ", object.name,
>                             "[index, , drop=FALSE]", sep="")))
>     # do some further manipulations here ....
> 
>     return(temp)
> }
> 
> But this seems very messy. Is there a better way?
> 
> Thanks for your help
> 
> David Khabie-Zeitoune
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help



