[R] lm looking for weights outside of the user-defined function

William Dunlap wdunlap at tibco.com
Fri Oct 22 21:17:29 CEST 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> Sent: Friday, October 22, 2010 6:25 AM
> To: Dimitri Liakhovitski
> Cc: r-help
> Subject: Re: [R] lm looking for weights outside of the user-defined function
> 
> 
> On Oct 22, 2010, at 9:18 AM, Dimitri Liakhovitski wrote:
> 
> > David,
> > I understand - and I am sure what you are suggesting should work. But I
> > just can't understand why it's not grabbing things INSIDE the
> > environment of the formula first.
> 
> I am not sure that either one of us understands what is meant by "the
> environment of the formula".

"The environment of the formula" is the output of
   environment(formula)
which is set to the environment in which the formula
was created.  The modelling functions look for
variables (in the formula, weights, and subset arguments)
in the order
   1) the data argument (usually a data frame, list, or environment)
   2) the environment of the formula
When an environment is searched for a name, the search
continues through all ancestral environments until the
name is found or until you run out of ancestors.
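
As a rough illustration of that lookup order, here is a minimal
sketch. The helper names make_formula() and make_formula2() and the
weight vectors inside them are hypothetical, made up for this
example; only lm(), environment(), and the mtcars data come from R
itself.

```r
# Sketch only: make_formula() and the weight vector 'w' are hypothetical.
make_formula <- function() {
    w <- rep(2, nrow(mtcars))  # weights that exist only in this call's frame
    mpg ~ cyl                  # the formula records this frame as its environment
}
f <- make_formula()

# 'w' is not a column of mtcars, so lm() falls back to environment(f).
has_w <- exists("w", envir = environment(f), inherits = FALSE)
fit <- lm(f, data = mtcars, weights = w)

# The data argument is searched first: mtcars has a column 'wt', so a
# same-named variable in the formula's environment is never consulted.
make_formula2 <- function() {
    wt <- rep(1, nrow(mtcars))  # shadowed by the mtcars$wt column
    mpg ~ cyl
}
f2 <- make_formula2()
fit2 <- lm(f2, data = mtcars, weights = wt)
uses_column <- isTRUE(all.equal(as.numeric(weights(fit2)), mtcars$wt))
```

In the first fit the weights come from the formula's environment; in
the second they come from the data, even though a variable of the same
name is visible from the formula's environment.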

You can reassign the environment of a formula.  E.g.,
compare the following two:

  > wr0 <- function(formula, MyData, WeightsVector) {
  +     lm(formula, data=MyData, weights=WeightsVector)
  + }
  > wr1 <- function(formula, MyData, WeightsVector) {
  +     environment(formula) <- environment()
  +     lm(formula, data=MyData, weights=WeightsVector)
  + }
  > wr0(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
  Error in eval(expr, envir, enclos) : object 'WeightsVector' not found
  > wr1(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))

  Call:
  lm(formula = formula, data = MyData, weights = WeightsVector)

  Coefficients:
  (Intercept)          cyl
       38.567       -2.966

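Reassigning the environment can lead to the sort of
surprises that dynamic scoping gives you: names in the
formula that are not found in the data are then looked up
starting in the function's own frame instead of in the
environment where the caller wrote the formula.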
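
One such surprise can be sketched as follows. The functions caller(),
caller2() and wr2(), and the variable log_disp, are hypothetical names
invented for this example. wr2() is only one possible workaround, not
something prescribed by the lm documentation: it hangs a small child
environment off the formula's original environment so that the
original ancestors stay reachable.

```r
# wr1() as above: the formula's original environment is discarded.
wr1 <- function(formula, MyData, WeightsVector) {
    environment(formula) <- environment()
    lm(formula, data = MyData, weights = WeightsVector)
}

# Hypothetical caller whose predictor lives only in its own frame.
caller <- function() {
    log_disp <- log(mtcars$disp)
    wr1(mpg ~ log_disp, MyData = mtcars, WeightsVector = sqrt(1:32))
}
# log_disp is neither in mtcars nor visible from wr1()'s frame, so this fails.
res <- try(caller(), silent = TRUE)
failed <- inherits(res, "try-error")

# Possible workaround (an assumption, not the documented approach):
# keep the original environment as the parent of a new one that holds
# only the weights.
wr2 <- function(formula, MyData, WeightsVector) {
    env <- new.env(parent = environment(formula))
    env$WeightsVector <- WeightsVector
    environment(formula) <- env
    lm(formula, data = MyData, weights = WeightsVector)
}
caller2 <- function() {
    log_disp <- log(mtcars$disp)
    wr2(mpg ~ log_disp, MyData = mtcars, WeightsVector = sqrt(1:32))
}
fit <- caller2()  # finds both log_disp and WeightsVector
```

Putting the weights into the data argument as a column, as suggested
earlier in the thread, avoids touching the formula's environment
altogether.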

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> 
> > I've already tried to define the weights outside of the function - and
> > it finds them.
> >
> > But shouldn't it go in this order?
> > 1. Look in the data frame
> > 2. Look in the environment of the user-defined function
> > 3. Look outside.
> 
> Hey, I only work here, I don't make the rules, I just follow them. I
> agree that one might guess that to be the search order, but it is not
> what is documented.
> 
> -- 
> David Winsemius, MD
> West Hartford, CT
> 
> >
> > Dimitri
> >
> > On Fri, Oct 22, 2010 at 9:15 AM, David Winsemius
> > <dwinsemius at comcast.net> wrote:
> >>
> >> On Oct 22, 2010, at 9:01 AM, Dimitri Liakhovitski wrote:
> >>
> >>> Dear R'ers,
> >>>
> >>> I am fighting with a problem that is driving me crazy. I use "lm" in
> >>> my user-defined function, but it seems to be looking for weights
> >>> outside of my function's environment:
> >>>
> >>> ### Generating example data:
> >>> x<-data.frame(y=rnorm(100,0,1),a=rnorm(100,1,1),b=rnorm(100,2,1))
> >>> myweights<-runif(100)
> >>> data.for.regression<-x[1:3]
> >>>
> >>> ### Creating function "weighted.reg":
> >>> weighted.reg=function(formula, MyData, filename,WeightsVector)
> >>> {
> >>>        print(dim(MyData))
> >>>        print(filename)
> >>>        print(length(WeightsVector))
> >>>        regr.f<-lm(formula,MyData,weights=WeightsVector,na.action=na.omit)
> >>>        results<-as.data.frame(round(summary(regr.f)$coeff,3))
> >>>        write.csv(results,file=filename)
> >>>        return(results)
> >>> }
> >>>
> >>> ### Running "weighted.reg" with my data:
> >>> reg2<-weighted.reg(y~., MyData=x, WeightsVector=myweights,
> >>> filename="TEST.csv")
> >>>
> >>>
> >>> I get an error: Error in eval(expr, envir, enclos) : object
> >>> 'WeightsVector' not found
> >>> Notice that the function correctly prints length(WeightsVector). But
> >>> it looks like "lm" is looking for weights (in the 4th line of the
> >>> function) OUTSIDE the function and does not see WeightsVector.
> >>
> >> Have you tried putting WeightsVector in the "x" dataframe? That would
> >> seem to reduce the potential for environmental conflation.
> >>
> >> From the details section of help(lm):
> >> "All of weights, subset and offset are evaluated in the same way as
> >> variables in formula, that is first in data and then in the
> >> environment of formula."
> >>
> >>
> >>> Why is it looking outside the function for the object that has just
> >>> been defined inside the function?
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


