[R] Function to modify existing data.frame--Improving R Language

Peter Muhlberger peterm at andrew.cmu.edu
Wed Jan 19 22:36:00 CET 2005


Thomas & Jeff:  Thanks again for your thoughts.  The program Thomas suggests
below is elegant, but I was avoiding that because I assumed the memory
requirements and amount of time required for a large dataset would be
substantial.  Of course, it depends on what's happening 'under the hood.'
Perhaps mydata isn't actually copied and then replaced w/ a modified copy of
itself, as it appears to be.  R might simply keep one copy & a list of updates in
memory.  I tried something like the program below w/ my data & it only took
a couple of seconds, so this looks like the elegant solution to my problem.
Thomas is right that there is a programming advantage to pass by value,
though I wonder whether for complex programming it would be enough to allow
a function to modify only one workspace object at a time.  I guess I'll see.
R is very slick.
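
To make the reassignment idiom concrete, here is a minimal sketch.  The body
of some.program() and the columns x, y and z are made up for illustration;
they are not the actual program discussed in this thread.  As far as I
understand, R defers copying until an object is modified, so the cost depends
on what the function changes, and timing it on real data is the safest check.

```r
# Hypothetical stand-in for some.program(): takes a data frame and
# returns a modified version of it.
some.program <- function(df) {
  df$z <- df$x + df$y   # add a derived column to the function's own copy
  df
}

# Made-up example data
mydata <- data.frame(x = runif(1e5), y = runif(1e5))
original <- mydata      # second name for the unmodified data

# Time the update-by-reassignment idiom
elapsed <- system.time(mydata <- some.program(mydata))
print(elapsed)

# The function changed only what it returned; 'original' has no z column
"z" %in% names(original)
"z" %in% names(mydata)
```

Because the function sees its argument by value, the only object that
changes is the one assigned to on the left-hand side.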

Peter

On 1/19/05 1:31 PM, "Thomas Lumley" <tlumley at u.washington.edu> wrote:

> I don't see why
>    mydata <- some.program(mydata)
> is much less elegant than
>    mydata.someProgram()
> as a way of updating a data set. It may use more memory, but that wasn't
> the point at issue.
> 
> Of course there are advantages to the ability to pass by reference, and
> disadvantages -- the most obvious disadvantage is that it is not easy to
> tell which variables are modified by a given piece of code.
> 
> It probably wouldn't be that hard to produce something that looked like a
> data frame but was passed by reference, by wrapping it in an environment.
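
A rough sketch of the environment idea mentioned above, under stated
assumptions: as.ref() and add.sum() are hypothetical helper names invented
here, not functions from R or from any package.  Environments are not copied
when passed to a function, so a change made inside the function is visible to
the caller.

```r
# Hypothetical helper: store a data frame inside a new environment
as.ref <- function(df) {
  ref <- new.env()
  ref$data <- df
  ref
}

# Hypothetical function that updates the wrapped data frame in place;
# it returns nothing useful, and the change is a side effect on 'ref'
add.sum <- function(ref) {
  ref$data$z <- ref$data$x + ref$data$y
  invisible(NULL)
}

mydata <- as.ref(data.frame(x = 1:3, y = 4:6))
add.sum(mydata)          # no reassignment needed
mydata$data$z            # 5 7 9
```

This also shows the disadvantage noted above: nothing at the call site
add.sum(mydata) indicates that mydata is modified.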



