[Rd] lm() takes weights from formula environment

John Mount jmount at win-vector.com
Mon Aug 10 20:54:58 CEST 2020


Forgot the url: https://win-vector.com/2014/05/30/trimming-the-fat-from-glm-models-in-r/

On Aug 10, 2020, at 11:50 AM, John Mount <jmount at win-vector.com> wrote:

Thank you for your suggestion. I do know how to work around the issue. I usually build a fresh environment as a child of the base environment and then insert the weights there. I was just trying to provide an example of the issue.
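A minimal sketch of that workaround, reusing the data from the simplified example further down the thread. The name `env` is only illustrative, and this is one way to arrange it rather than a recommended recipe:

```r
d <- data.frame(x = 1:3, y = c(1, 2, 1))
w <- c(1, 10, 1)
f <- as.formula(y ~ x)

# Fresh environment whose parent is the base environment.
# It holds the weights and nothing else.
env <- new.env(parent = baseenv())
assign("w", w, envir = env)
environment(f) <- env

# `w` is now found through the formula environment,
# and functions such as `list` are found through baseenv().
fit <- lm(f, data = d, weights = w)
ls(environment(f))
```

The formula then carries only the weights with it, rather than whatever else happened to be in the environment where it was created.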

emptyenv() cannot be used, as the environment is needed for the eval (it errors out with "could not find function list" even if weights are not used).
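A small sketch to reproduce the emptyenv() failure, with the error captured by tryCatch() so the script keeps running. The exact wording of the message may differ between R versions, so it is printed rather than assumed:

```r
d <- data.frame(x = 1:3, y = c(1, 2, 1))
f <- as.formula(y ~ x)
environment(f) <- emptyenv()

# No weights argument at all; the call is still expected to fail because
# nothing, not even `list`, can be looked up through an empty environment.
res <- tryCatch(lm(f, data = d), error = function(e) e)

if (inherits(res, "error")) {
  print(conditionMessage(res))
}
```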

For some applications one doesn't want the formula to have a non-trivial environment with respect to serialization.  Nina Zumel wrote about reference leaks in lm()/glm() and a good part of that was environments other than global/base (such as those formed when building a formula in a function) capturing references to unrelated structures.
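To illustrate the serialization concern, here is a rough sketch. The function `make_formula` and the object `big` are made up for this example; the point is only that a formula built inside a function keeps that function's frame alive, and serialize() writes the frame's contents out with it:

```r
make_formula <- function() {
  big <- numeric(1e5)  # unrelated object that happens to live in this frame
  y ~ x                # the formula captures this frame as its environment
}

f <- make_formula()
captured <- ls(environment(f))            # includes "big"
size_captured <- length(serialize(f, NULL))

# Replacing the environment drops the reference to the function frame.
environment(f) <- baseenv()
size_trimmed <- length(serialize(f, NULL))

c(captured = size_captured, trimmed = size_trimmed)
```

The first size should be dominated by `big`, even though the formula never mentions it.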



On Aug 10, 2020, at 11:34 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

On 10/08/2020 1:42 p.m., John Mount wrote:
I wish I had started with "I am disappointed that lm() doesn't continue its search for weights into the calling environment" or "the fact that lm() looks only in the formula environment and data frame for weights doesn't seem consistent with how other values are treated."

Normally searching is done automatically by following a chain of environments.  It's easy to add something to the head of the chain (e.g. data); it's hard to add something in the middle or at the end (because the chain ends with emptyenv(), which is not allowed to have a parent).

So I'd suggest using

environment(f) <- environment()

before calling lm() if you want the calling environment to be in the search.  Setting it to baseenv() doesn't really make sense, unless you want to disable all searches except in data, in which case emptyenv() would make more sense (but I haven't tried it, so it might break something).
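A sketch of how that assignment might look when the weights exist only inside a function. The wrapper `fit_weighted` is a hypothetical name for this example:

```r
fit_weighted <- function(f, d) {
  w <- c(1, 10, 1)                 # weights exist only in this function's frame
  environment(f) <- environment()  # changes the local copy of f only
  lm(f, data = d, weights = w)
}

d <- data.frame(x = 1:3, y = c(1, 2, 1))
f <- as.formula(y ~ x)
environment(f) <- baseenv()        # the setting that fails in the simplified example

fit <- fit_weighted(f, d)
fit$weights
```

Note that the fitted model then holds a reference to the function's frame, so anything else defined in that frame travels with it.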

Duncan Murdoch

But I did not. So I do apologize for both that and for the negative tone on my part.

Simplified example:

d <- data.frame(x = 1:3, y = c(1, 2, 1))
w <- c(1, 10, 1)
f <- as.formula(y ~ x)
lm(f, data = d, weights = w)  # works
# fails
environment(f) <- baseenv()
lm(f, data = d, weights = w)
# Error in eval(extras, data, env) : object 'w' not found

On Aug 9, 2020, at 11:56 AM, Duncan Murdoch <murdoch.duncan at gmail.com> wrote:

This is fairly clearly documented in ?lm:




