[Rd] how to control the environment of a formula

Fri Apr 19 19:27:12 CEST 2013

Duncan,
 I stand by all my comments.  Well-behaved functions -- those that look
only at their input arguments -- do just fine with a simple env.
 Now as to formulas --- the part of R that has most aggressively messed
with normal evaluation rules.  It is quite possible that there is/was no
other way to implement their functionality set, so I'm not throwing rocks
at that.  However, as soon as they enter the scene the consequences
multiply like rabbits and I feel like I've fallen into a hall of mirrors.
Nothing else has caused me as much ongoing confusion and wonderment in the
survival package.
  As soon as you introduced them, all my arguments became irrelevant.

Terry T

On 4/19/13 9:05 AM, "Duncan Murdoch" <murdoch.duncan at gmail.com> wrote:

>On 13-04-19 8:41 AM, Therneau, Terry M., Ph.D. wrote:
>>   I went through the same problem and discovery process 2 years ago
>>with the survival package.  With pspline()  terms the return object from
>>coxph includes a simple 6 line function for enhanced printout, which by
>>default carried along another 30 irrelevant things some of which were
>>huge.
>> I personally think that setting environment(f) <- .GlobalEnv is the
>>clearest and simplest solution.
>> Note that R does not save the environment of functions defined at the
>>top level; the prior line says to treat your function as "one of those".
>> It works very well as long as your function is an actual function,
>>i.e., it depends only on its input arguments.
>>
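A minimal sketch of the situation described above, assuming a hypothetical constructor `make_printer` (an illustrative name, not code from the survival package). A function created inside another function keeps that function's frame as its environment, so every local object travels with it. Resetting the environment to `.GlobalEnv` drops that baggage, which is safe only when the function uses nothing but its arguments.

```r
# Hypothetical example; make_printer and big are illustrative names only.
make_printer <- function() {
    big <- rnorm(1e6)              # large local object the closure never uses
    function(x) format(x, digits = 3)
}

p <- make_printer()
ls(environment(p))                 # the closure's environment still holds "big"

environment(p) <- .GlobalEnv       # treat p like a function defined at top level
p(pi)                              # still works: p depends only on its argument
```

If the returned function referred to `big`, the reset would break it or silently pick up a global object of the same name.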
>> \begin{opinion}
>>    S started out as a pure functional language.  That is, a function
>>depends ONLY on its arguments.   Many of the strengths of S/R flow
>>directly from the simplicity and rigor that this gives.
>> There is an adage in programming, going back to at least the earliest
>>Fortran compilers,  that all successful languages have a way to break
>>their own rules;  and S indeed had some hidden workarounds.  Formalizing
>>these non-functional back doors as R has done with environments is a
>>good thing.
>>
>> However, the back doors should be used only with extreme reluctance.  I
>>cringe at each new "how to be sneaky" discussion on the mailing lists.
>>The 'solution' is rarely worth the long term price.
>>   \end{opinion}
>
>Hmmm, it seems to me that your first paragraph contradicts your opinion.
>  If you set the environment of a formula to .GlobalEnv then suddenly
>the way that formula acts depends on all sorts of things that weren't
>there when it was created.
>
>Attaching the environment at the time of creation of a formula means that
>the names within it refer to data that is currently in scope.  That's
>generally a good thing.  It means that code will act the same when you
>run it at the top level or in a function.
>
>For example, consider this:
>
>f <- function() {
>    x <- 1:10
>    x2 <- x^2
>    y <- rnorm(10, mean=x2)
>    formula <- y ~ x + x2
>    formula
>}
>
>fit <- lm(f())
>update(fit, . ~ . - x)
>
>
>This code works fine, all because the formula keeps the environment
>where it was created.  If I modify it like this:
>
>f <- function() {
>    x <- 1:10
>    x2 <- x^2
>    y <- rnorm(10, mean=x2)
>    formula <- y ~ x + x2
>    environment(formula) <- .GlobalEnv
>    formula
>}
>
>fit <- lm(f())
>update(fit, . ~ . - x)
>
>
>then I really have no idea what it will produce, because it depends on
>global variables y, x and x2, not the local ones created in the
>function.  If I'm lucky, I'll get an "object not found" error; if I'm
>not lucky, it'll just go find some other variables and use those.
>
>Duncan Murdoch
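
The difference between the two versions of f() can be inspected directly. The sketch below assumes a hypothetical helper `make_formula` (not part of either message) and uses `model.frame()` to show where the variables are looked up. It is an illustration of the point, not a recommendation for either approach.

```r
# Hypothetical example; make_formula is an illustrative name only.
make_formula <- function(reset = FALSE) {
    x <- 1:10
    y <- 2 * x + 1
    fm <- y ~ x
    if (reset) environment(fm) <- .GlobalEnv
    fm
}

local_fm  <- make_formula()
global_fm <- make_formula(reset = TRUE)

# By default the formula's environment is the function's frame, which holds y.
exists("y", envir = environment(local_fm), inherits = FALSE)
identical(environment(global_fm), globalenv())

# With no data argument, model.frame() finds x and y via the formula's environment.
nrow(model.frame(local_fm))

# After the reset, the lookup starts at the global environment instead.
# tryCatch() is used because the result depends on what happens to be defined there.
res <- tryCatch(model.frame(global_fm), error = function(e) conditionMessage(e))
```

What `res` contains depends entirely on the session: an error message if no global `x` and `y` exist, or a model frame built from unrelated objects if they do.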