[Rd] update.default: fall back on model.frame in case that the data frame is not in the parent environment

Tue Aug 2 16:48:12 CEST 2011

> mm <- function(datf) {
>    lm(y ~ x, data = datf)
> }
> mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
>                      z = rnorm(20))
> 
> l <- mm(mydatf)
> update(l, . ~ . + z)   # This fails, z is not found

Good point. So let me rephrase the initial problem:

1.) An lm object is fitted somewhere with some data, which resides
somewhere in memory.
2.) An ideal update function would know where the original data is,
rather than assuming that it is stored
  a.) in the parent frame
  b.) under the name given in the call slot of the lm object.

While assumption a.) seems reasonable to me, assumption b.) is awkward,
as pointed out: it makes it cumbersome to update models that were
created inside a function (which should not be too rare a use case).
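
To illustrate assumption b.), here is a minimal sketch that reuses mm
and mydatf from the quoted example (the set.seed call is my addition,
only for reproducibility). It shows that the call stored in the fit
refers to the data by the function-local name datf, and that update
simply carries this name over into the new call:

```r
mm <- function(datf) {
   lm(y ~ x, data = datf)
}
set.seed(42)  # assumption: only here to make the sketch reproducible
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))

l <- mm(mydatf)

# The stored call still mentions the local argument name 'datf'
l$call

# With evaluate = FALSE, update() returns the new call unevaluated,
# so we can inspect it without triggering the failing lookup
new_call <- update(l, . ~ . + z, evaluate = FALSE)
new_call$data   # still the symbol datf, not mydatf
```

Evaluating new_call can only succeed where an object called datf is
visible, which is not the case outside of mm.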

Thus, I have two questions:
1.) Is it somehow possible to retrieve the original data.frame with
which an lm was fitted just from the knowledge of the fit? I fear that
model.frame is the best I have.
2.) Is there any other way of making update aware of where to look for
the model building data?
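
To make both questions concrete, here is a small sketch under the same
setup as the quoted example (set.seed is my addition). It shows why I
fear that model.frame is not enough for 1.): it only keeps the
variables that were used in the fit, so z is gone. For 2.), the one way
I am aware of is to hand the data to update explicitly:

```r
mm <- function(datf) {
   lm(y ~ x, data = datf)
}
set.seed(42)  # assumption: only here to make the sketch reproducible
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))
l <- mm(mydatf)

# model.frame() recovers only the variables used in the formula
mf <- model.frame(l)
names(mf)        # "y" "x", the column z is not available

# Passing data explicitly replaces 'datf' in the stored call,
# so the lookup of the local name is avoided altogether
l2 <- update(l, . ~ . + z, data = mydatf)
names(coef(l2))
```

This works, but it requires the caller to know which data the model was
fitted with, which is exactly what I would like update to find out.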

By the way, another work-around I was just thinking of is to use

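mm <- function(datf) {
   l <- lm(y ~ x, data = datf)
   # Replace the local name 'datf' in the stored call by the expression
   # the caller passed in (e.g. mydatf), so that update() can find it
   call <- l$call
   call$data <- substitute(datf)
   l$call <- call
   l
}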

which solves my issue (and which I can very well live with), but I
was wondering whether you see any chance that update could be made
smarter? Thanks for your input.
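
For completeness, a short usage sketch of the work-around with the data
from the quoted example (set.seed is my addition):

```r
mm <- function(datf) {
   l <- lm(y ~ x, data = datf)
   call <- l$call
   call$data <- substitute(datf)  # the expression supplied by the caller
   l$call <- call
   l
}
set.seed(42)  # assumption: only here to make the sketch reproducible
mydatf <- data.frame(x = rep(1:2, 10), y = rnorm(20, rep(1:2, 10)),
                     z = rnorm(20))

l <- mm(mydatf)
l$call$data              # now the symbol mydatf instead of datf

l2 <- update(l, . ~ . + z)   # z is found, because mydatf is visible here
names(coef(l2))
```

Note that this relies on the expression passed to mm still being
evaluable at the place where update is called later on.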

KR,

-Thorn