[R] behaviour of formula objects and environment inside functions

William Dunlap wdunlap at tibco.com
Thu Mar 21 04:35:57 CET 2013


I didn't see where you said what your goal was in making the environment of
a formula an empty environment.  I'm guessing that you want to make sure
the variables in the formula come from the data.frame given to a fitting function
along with the formula (so that typos cause errors for sure instead of sometimes
giving an incorrect answer).

Note that environment(formula) is used to look up not only the variables (and
functions) in a formula, but also to look up some things used in a call to model.frame.
Hence setting the formula's environment to emptyenv() is not very useful - it
limits things too much.

  > form1 <- y ~ x1 + x2
  > environment(form1) <- emptyenv()
  > dat <- data.frame(y=log(1:10), x1=1/(1:10), x2=sqrt(1:10))
  > fit <- lm(form1, data=dat)
  Error in eval(expr, envir, enclos) : could not find function "list"
  > traceback()
  7: eval(expr, envir, enclos)
  6: eval(predvars, data, env)
  5: model.frame.default(formula = form1, data = dat, drop.unused.levels = TRUE)
  4: model.frame(formula = form1, data = dat, drop.unused.levels = TRUE)
  3: eval(expr, envir, enclos)
  2: eval(mf, parent.frame())
  1: lm(form1, data = dat)

I'm a bit surprised that this error happens - it might be avoided by rewriting
parts of model.frame.  I can avoid it by doing
  > e <- new.env(parent=emptyenv())
  > e$list <- base::list
  > environment(form1) <- e
  > fit <- lm(form1, data=dat)
The fix may not be worthwhile because it won't help you with a formula
like y~x1+sin(x2) - 'sin' will not be found.
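
To illustrate that limitation, here is a minimal sketch (form2, e2 and try_fit
are illustrative names, not from the session above).  It also tries one possible
variation: giving the new environment baseenv() as its parent, so that base
functions such as 'list' and 'sin' are visible while globalenv() and the other
attached packages are not.  That variation is an assumption about what might
suit your goal, not something I have checked against every fitting function.

```r
dat <- data.frame(y = log(1:10), x1 = 1/(1:10), x2 = sqrt(1:10))
form2 <- y ~ x1 + sin(x2)

# Hypothetical helper: fit the model, returning the error message on failure.
try_fit <- function(formula, data) {
  tryCatch(lm(formula, data = data),
           error = function(err) conditionMessage(err))
}

# Environment holding only 'list': 'sin' cannot be found from here,
# so the fit is expected to fail and res_list_only holds the message.
e2 <- new.env(parent = emptyenv())
e2$list <- base::list
environment(form2) <- e2
res_list_only <- try_fit(form2, dat)

# Variation (an assumption, see above): an otherwise empty environment whose
# parent is baseenv().  Lookups reach package:base and then stop at
# emptyenv(); they never reach globalenv().
environment(form2) <- new.env(parent = baseenv())
res_base <- try_fit(form2, dat)
```

With the baseenv() parent, functions from other packages (for example
stats::poly) would still not be found unless you qualify them with '::'.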

You could use
  environment(form1) <- parent.env(globalenv())
so all attached packages may be used but not globalenv().  Since packages
tend to contain functions and not much data this may help if you are just
trying to generate errors when there is a typo in the formula.
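
A minimal sketch of that idea, assuming a stray 'x2' exists outside the
data.frame while the data.frame itself lacks that column (dat3, form3 and the
result names are illustrative only):

```r
# Data frame that is missing 'x2', e.g. because of a typo or a dropped column.
dat3 <- data.frame(y = log(1:10), x1 = 1/(1:10))

# A stray variable with the same name, defined outside the data frame.
x2 <- sqrt(1:10)

form3 <- y ~ x1 + x2

# With the default formula environment the stray 'x2' is picked up silently.
fit_global <- lm(form3, data = dat3)

# Skip globalenv(): lookups start at the first attached package instead.
environment(form3) <- parent.env(globalenv())
res_pkgs <- tryCatch(lm(form3, data = dat3),
                     error = function(err) conditionMessage(err))
# res_pkgs should now hold an error message saying 'x2' was not found.
```

Note that a dataset or function named 'x2' in some attached package would
still be found, so this narrows the problem without eliminating it.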

Knowing why you want the environment of a formula to be empty would
help answer your question.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Charles Berry
> Sent: Wednesday, March 20, 2013 7:04 PM
> To: r-help at stat.math.ethz.ch
> Subject: Re: [R] behaviour of formula objects and environment inside functions
> 
> Thomas Alexander Gerds <tag <at> biostat.ku.dk> writes:
> 
> >
> > Dear List
> >
> > I am looking for the recommended way to create a formula inside a
> > function with an empty environment. I tried several versions (see
> > below), and one of them seemed to work, but I don't understand why there
> > is a difference between .GlobalEnv and the environment inside a
> > function. I would be grateful for any reference or explanation or
> > advice.
> [snip]
> 
> From ?formula
> 
> Environments:
> 
>      A formula object has an associated environment, and this
>      environment (rather than the parent environment) is used by
>      'model.frame' to evaluate variables that are not found in the
>      supplied 'data' argument.
> 
> So write four functions that:
> 
> 1) create a formula
> 2) create some data
> 3) evaluate a formula using model.frame (even implicitly with lm(), say)
> 4) call the functions from 1, 2, and 3
> 
> When you run '4', the result will depend on the environment of data from 2
> and the environment of the formula from 1. If they are both in the same
> environment, fine. If not, you might get lucky and have the data in a place
> where it will be found nevertheless.
> 
> If you are really unlucky the '4' function will find some other data that
> match the formula and use it.
> 
> HTH,
> 
> Chuck
> 


