[Rd] how to control the environment of a formula

Duncan Murdoch murdoch.duncan at gmail.com
Sat Apr 20 19:44:35 CEST 2013

On 13-04-19 2:57 PM, Thomas Alexander Gerds wrote:
> hmm. I have tested a bit more, and found this situation, which is perhaps
> more difficult to solve. even though I delete x, since x is part of the
> output of the formula, the size of the object is twice as much as it should be:
> test <- function(x){
>    x <- rnorm(1000000)
>    out <- list(x=x)
>    rm(x)
>    out$f <- as.formula(a~b)
>    out
> }
> v <- test(1)
> x <- rnorm(1000000)
> save(v,file="~/tmp/v.rda")
> save(x,file="~/tmp/x.rda")
> system("ls -lah ~/tmp/*.rda")
> -rw-rw-r-- 1 tag tag  15M Apr 19 20:52 /home/tag/tmp/v.rda
> -rw-rw-r-- 1 tag tag 7,4M Apr 19 20:52 /home/tag/tmp/x.rda
> can you solve this as well?

Yes, this is tricky.  The problem is that "out" is in the environment of 
out$f, so you get two copies when you save it.  (I think you won't have 
two copies in memory, because R only makes a copy when it needs to, but 
I haven't traced this.)
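
To see the extra copy directly, here is a small sketch using a shorter vector than the original example. The names env, size_v and size_x are only for illustration, and serialize() is used as a stand-in for what save() writes, which is an assumption about how closely the two sizes track each other.

```r
test <- function(x) {
  x <- rnorm(1000)
  out <- list(x = x)
  rm(x)
  out$f <- as.formula(a ~ b)
  out
}

v <- test(1)

# The formula remembers the function's frame, and "out" still lives there
env <- environment(v$f)
ls(env)

# The vector reachable through the formula is the same data as v$x
identical(env$out$x, v$x)

# Serialized size of the whole object versus the vector alone
size_v <- length(serialize(v, NULL))
size_x <- length(serialize(v$x, NULL))
c(v = size_v, x = size_x)
```

The serialized v should come out at roughly twice the size of the vector, which matches the file sizes in the listing above.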

Here are two solutions; both have some problems.

1.  Don't put out in the environment:

test <- function(x) {
   x <- rnorm(1000000)
   out <- list(x=x)
   out$f <- a ~ b    # the as.formula() was never needed
   # temporarily create a new environment
   local({
     # get a copy of what you want to keep
     out <- out
     # remove everything that you don't need from the formula
     rm(list=c("x", "out"), envir=environment(out$f))
     # return the local copy
     out
   })
}

I don't like this because it is too tricky, but you could probably wrap 
the tricky bits into a little function (a variant on return() that 
cleans out the environment first), so it's probably what I would use if 
I was desperate to save space in saved copies.
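
A rough sketch of what such a wrapper might look like follows. The name return_clean and its arguments are hypothetical, not an existing R function, and it assumes it is called as the last expression of the function whose frame should be emptied.

```r
# Hypothetical helper: evaluate `value`, remove every variable in the
# caller's frame except those listed in `keep`, then hand back the value.
return_clean <- function(value, keep = character(), env = parent.frame()) {
  force(value)  # evaluate the result before anything is removed
  drop <- setdiff(ls(env, all.names = TRUE), keep)
  rm(list = drop, envir = env)
  value
}

test <- function(x) {
  x <- rnorm(1000)
  out <- list(x = x)
  out$f <- a ~ b
  return_clean(out)
}

v <- test(1)
ls(environment(v$f), all.names = TRUE)  # nothing left in the formula's environment
```

Variables that the formula really needs can be listed in keep, so they survive in the environment.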

2. Never evaluate the formula in the first place, so it doesn't pick up 
the environment:

test <- function(x) {
   x <- rnorm(1000000)
   out <- list(x=x)
   out$f <- quote(a ~ b)
   out
}

This is a lot simpler, but it might not work with some modelling 
functions, which would be confused by receiving the model formula 
unevaluated.  It also has the problems that you get with using 
.GlobalEnv as the environment of the formula, but maybe to a slightly 
lesser extent:  rather than having what is possibly the wrong 
environment, it doesn't have one at all.
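
To illustrate the difference, the sketch below shows what quote() gives you and one way to turn it into a real formula later. The names f_quoted, e and f are made up for the example, and evaluating in a fresh environment is just one possible choice.

```r
f_quoted <- quote(a ~ b)

class(f_quoted)                 # a plain call, not a formula
inherits(f_quoted, "formula")   # FALSE
environment(f_quoted)           # NULL: no environment is attached

# Evaluating the call later creates a formula whose environment is
# wherever the evaluation happens (here a fresh, empty environment)
e <- new.env()
f <- eval(f_quoted, e)
inherits(f, "formula")          # TRUE
identical(environment(f), e)    # TRUE
```

A modelling function that insists on a formula could be given eval(f_quoted, e) instead, at the cost of choosing the environment yourself.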

Duncan Murdoch

> thanks!
> thomas
> Duncan Murdoch <murdoch.duncan at gmail.com> writes:
>> On 13-04-18 11:39 AM, Thomas Alexander Gerds wrote:
>>> Dear Duncan
>>> thank you for taking the time to answer my questions! It will be
>>> quite some work to delete all the objects generated inside the
>>> function ... but if there is no other way to avoid a large
>>> environment then this is what I will do.
>> It's not really that hard.  Use names <- ls() in the function to get a
>> list of all of them; remove the names of variables that might be
>> needed in the formula (and the name of the formula itself); then use
>> rm(list=names) to delete everything else just before returning it.
>> Duncan Murdoch
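
A minimal sketch of the ls()/rm() approach described in the quoted message above, assuming the formula is stored in a variable called f and that no other local variable is needed by it (tmp is just a made-up intermediate result):

```r
test <- function(x) {
  x <- rnorm(1000)
  tmp <- cumsum(x)          # some intermediate result we do not want to keep
  f <- a ~ b

  names <- ls()                  # everything defined so far
  names <- setdiff(names, "f")   # keep the formula (and anything it needs)
  rm(list = c(names, "names"))   # delete the rest, including the name list
  f
}

v <- test(1)
ls(environment(v))   # only "f" is left
```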
