[Rd] how to control the environment of a formula

Gabor Grothendieck ggrothendieck at gmail.com
Sat Apr 20 20:17:11 CEST 2013


On Sat, Apr 20, 2013 at 1:44 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
> On 13-04-19 2:57 PM, Thomas Alexander Gerds wrote:
>>
>>
>> hmm. I have tested a bit more, and found this situation, which is perhaps
>> more difficult to solve. even though I delete x, since x is part of the
>> output alongside the formula, the object is twice as large as it should be:
>>
>> test <- function(x){
>>    x <- rnorm(1000000)
>>    out <- list(x=x)
>>    rm(x)
>>    out$f <- as.formula(a~b)
>>    out
>> }
>> v <- test(1)
>> x <- rnorm(1000000)
>> save(v,file="~/tmp/v.rda")
>> save(x,file="~/tmp/x.rda")
>> system("ls -lah ~/tmp/*.rda")
>>
>> -rw-rw-r-- 1 tag tag  15M Apr 19 20:52 /home/tag/tmp/v.rda
>> -rw-rw-r-- 1 tag tag 7,4M Apr 19 20:52 /home/tag/tmp/x.rda
>>
>> can you solve this as well?
>
>
> Yes, this is tricky.  The problem is that "out" is in the environment of
> out$f, so you get two copies when you save it.  (I think you won't have two
> copies in memory, because R only makes a copy when it needs to, but I
> haven't traced this.)
>
> Here are two solutions, both have some problems.
>
> 1.  Don't put out in the environment:
>
>
> test <- function(x) {
>   x <- rnorm(1000000)
>   out <- list(x=x)
>   out$f <- a ~ b    # the as.formula() was never needed
>   # temporarily create a new environment
>   local({
>     # get a copy of what you want to keep
>     out <- out
>     # remove everything that you don't need from the formula
>     rm(list=c("x", "out"), envir=environment(out$f))
>     # return the local copy
>     out
>   })
> }
>
> I don't like this because it is too tricky, but you could probably wrap the
> tricky bits into a little function (a variant on return() that cleans out
> the environment first), so it's probably what I would use if I was desperate
> to save space in saved copies.
>
> 2. Never evaluate the formula in the first place, so it doesn't pick up the
> environment:
>
>
> test <- function(x) {
>   x <- rnorm(1000000)
>   out <- list(x=x)
>   out$f <- quote(a ~ b)
>   out
> }
>
> This is a lot simpler, but it might not work with some modelling functions,
> which would be confused by receiving the model formula unevaluated.  It also
> has the problems that you get with using .GlobalEnv as the environment of
> the formula, but maybe to a slightly lesser extent:  rather than having what
> is possibly the wrong environment, it doesn't have one at all.
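
A minimal sketch of the "variant on return()" idea from Duncan's first
solution, assuming a hypothetical helper named clean_return() (it is not
part of R). It forces the value, empties the calling function's frame, and
then returns the value, so the formula's environment no longer holds a
second copy of the data. serialize() is used here only to approximate what
save() would write, and the vector is smaller than in the original example
to keep the check quick.

```r
# Hypothetical helper (not part of base R): return `value` after emptying
# the caller's frame, so a formula created there no longer drags along
# the caller's local variables when the result is saved.
clean_return <- function(value, env = parent.frame()) {
  force(env)                                          # fix the caller's frame
  force(value)                                        # evaluate before cleaning
  rm(list = ls(env, all.names = TRUE), envir = env)   # empty the caller's frame
  value
}

test_plain <- function() {
  x <- rnorm(100000)
  out <- list(x = x)
  rm(x)
  out$f <- a ~ b     # environment(out$f) is this frame, which holds `out`
  out
}

test_clean <- function() {
  x <- rnorm(100000)
  out <- list(x = x)
  out$f <- a ~ b
  clean_return(out)  # the frame is emptied before the value is handed back
}

# Approximate saved size: number of bytes produced by serialize()
saved_size <- function(obj) length(serialize(obj, NULL))

v_plain <- test_plain()
v_clean <- test_clean()

ls(environment(v_plain$f))   # the frame still contains "out"
ls(environment(v_clean$f))   # the frame is empty
saved_size(v_plain) / saved_size(v_clean)   # roughly 2 in this sketch
```

The helper removes everything from the caller's frame, so it is only
suitable as the very last call in a function, and it has not been checked
against functions that take a ... argument.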

An approach along the lines of Duncan's last solution that works with
lm but may or may not work with other regression-style functions is to
use a character string:

fit <- lm("demand ~ Time", BOD)

As long as you are only saving the input you should be OK, but if you
are saving the output of lm then you are back to the same problem,
since the "lm" object will contain a formula:

> class(formula(fit))
[1] "formula"
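
To make the distinction above concrete, here is a minimal sketch. The
make_spec() function and its field names are hypothetical, used only for
illustration. The point is that a formula kept as a character string
carries no environment, while the formula recovered from the fitted "lm"
object does.

```r
# Hypothetical constructor: keep the model specification as plain text
make_spec <- function() {
  x <- rnorm(100000)                 # large local data
  list(x = x, f = "demand ~ Time")   # a string has no environment attached
}

spec <- make_spec()
is.null(environment(spec$f))         # TRUE: nothing is captured by a string

fit <- lm(spec$f, BOD)               # lm accepts the string, as shown above
f <- formula(fit)
class(f)                             # "formula"
is.environment(environment(f))       # TRUE: the fitted object's formula has one
```

So saving spec is cheap, whereas saving fit may again pull in whatever
that environment contains.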


