[R] lm looking for weights outside of the user-defined function

William Dunlap wdunlap at tibco.com
Sat Oct 23 19:17:00 CEST 2010


> From: David Winsemius [mailto:dwinsemius at comcast.net] 
> Sent: Friday, October 22, 2010 9:43 PM
> To: William Dunlap
> Cc: Dimitri Liakhovitski; r-help
> Subject: Re: [R] lm looking for weights outside of the user-defined function
> 
> 
> On Oct 22, 2010, at 12:17 PM, William Dunlap wrote:
...
> > "The environment of the formula" is the output of
> >   environment(formula)
> > which is assigned to the current environment when the
> > formula is created.  The modelling functions look for
> > variables (in the formula, weights, and subset arguments)
> > in the order
> >   1) the data argument (usually an environment or a list)
> >   2) environment of the formula
> > When an environment is searched for a name, the search
> > continues through all ancestral environments until the
> > name is found or until you run out of ancestors.
> >
> > You can reassign the environment of a formula.  E.g.,
> > compare the following two:
> >
> >> wr0 <- function(formula, MyData, WeightsVector) {
> >  +     lm(formula, data=MyData, weights=WeightsVector)
> >  + }
> >> wr1 <- function(formula, MyData, WeightsVector) {
> >  +     environment(formula) <- environment()
> >  +     lm(formula, data=MyData, weights=WeightsVector)
> >  + }
> >> wr0(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
> >  Error in eval(expr, envir, enclos) : object 'WeightsVector' not found
> 
> The wr0 call created a formula, but the weights vector was "outside"
> that environment? And was that because the formula was created while
> the function arguments were being evaluated, when they wouldn't "see"
> each other?  Except this works:
> 
>  > xtoy <- function(x = 1:2, y=x){y}
>  > xtoy()
> [1] 1 2
> 
> I'm trying to figure out what makes the wr0 version fail and the
> xtoy() function succeed.

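The basic reason is that lm() calls eval() to evaluate
the formula, weights, and subset arguments in a
nonstandard (but well-defined) way: first in the data
argument, then in the environment of the formula.  xtoy()
uses the standard argument evaluation rules, under which
a default such as y=x is evaluated inside the function's
own frame, where x is visible.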
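
A minimal sketch of the difference, under stated assumptions:
toy_weights(), wrapper() and wrapper_fixed() are hypothetical names
made up for illustration, and toy_weights() only imitates the lookup
order documented in help(lm) (data first, then the environment of
the formula).  It is not lm()'s actual code.

```r
# Standard evaluation: the default y = x is evaluated inside xtoy's
# own frame, where x exists, so this works.
xtoy <- function(x = 1:2, y = x) y
xtoy()

# Hypothetical stand-in for lm()'s lookup (an assumption, not lm's code):
# capture the weights expression unevaluated, then evaluate it in `data`,
# falling back to environment(formula) and its ancestors.
toy_weights <- function(formula, data, weights) {
    w_expr <- substitute(weights)
    eval(w_expr, data, environment(formula))
}

# The formula is created by the caller, so its environment is the
# caller's, not wrapper's frame.  The symbol WeightsVector is then
# looked up in MyData and in the caller's environment, and is not found.
wrapper <- function(formula, MyData, WeightsVector) {
    toy_weights(formula, MyData, WeightsVector)
}
res <- tryCatch(wrapper(mpg ~ cyl, mtcars, sqrt(1:32)),
                error = function(e) conditionMessage(e))
print(res)

# Pointing the formula's environment at the wrapper's frame makes
# WeightsVector visible to the lookup.
wrapper_fixed <- function(formula, MyData, WeightsVector) {
    environment(formula) <- environment()
    toy_weights(formula, MyData, WeightsVector)
}
w <- wrapper_fixed(mpg ~ cyl, mtcars, sqrt(1:32))

# Alternative without touching environments: put the weights into the
# data, which is searched first.
d <- cbind(mtcars, w = sqrt(1:32))
fit <- lm(mpg ~ cyl, data = d, weights = w)
```

Putting the weights into the data frame avoids reassigning the
formula's environment, at the cost of copying the data.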
 
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
 
> 
> -- 
> David.
> >> wr1(mpg~cyl, MyData=mtcars, WeightsVector=sqrt(1:32))
> >
> >  Call:
> >  lm(formula = formula, data = MyData, weights = WeightsVector)
> >
> >  Coefficients:
> >  (Intercept)          cyl
> >       38.567       -2.966
> >
> > Reassigning the environment can lead to the sort of
> > surprises that dynamic scoping gives you.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >>
> >>> I've already tried to define the weights outside of the function - and
> >>> it finds them.
> >>>
> >>> But shouldn't it go in this order?
> >>> 1. Look in the data frame
> >>> 2. Look in the environment of the user-defined function
> >>> 3. Look outside.
> >>
> >> Hey, I only work here, I don't make the rules, I just follow them. I
> >> agree that one might guess that to be the search order, but it is not
> >> what is documented.
> >>
> >> -- 
> >> David Winsemius, MD
> >> West Hartford, CT
> >>
> >>>
> >>> Dimitri
> >>>
> >>> On Fri, Oct 22, 2010 at 9:15 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >>>>
> >>>> On Oct 22, 2010, at 9:01 AM, Dimitri Liakhovitski wrote:
> >>>>
> >>>>> Dear R'ers,
> >>>>>
> >>>>> I am fighting with a problem that is driving me crazy. I use "lm" in
> >>>>> my user-defined function, but it seems to be looking for weights
> >>>>> outside of my function's environment:
> >>>>>
> >>>>> ### Generating example data:
> >>>>> x<-data.frame(y=rnorm(100,0,1),a=rnorm(100,1,1),b=rnorm(100,2,1))
> >>>>> myweights<-runif(100)
> >>>>> data.for.regression<-x[1:3]
> >>>>>
> >>>>> ### Creating function "weighted.reg":
> >>>>> weighted.reg=function(formula, MyData, filename,WeightsVector)
> >>>>> {
> >>>>>       print(dim(MyData))
> >>>>>       print(filename)
> >>>>>       print(length(WeightsVector))
> >>>>>       regr.f<-lm(formula,MyData,weights=WeightsVector,na.action=na.omit)
> >>>>>       results<-as.data.frame(round(summary(regr.f)$coeff,3))
> >>>>>       write.csv(results,file=filename)
> >>>>>       return(results)
> >>>>> }
> >>>>>
> >>>>> ### Running "weighted.reg" with my data:
> >>>>> reg2<-weighted.reg(y~., MyData=x, WeightsVector=myweights,
> >>>>> filename="TEST.csv")
> >>>>>
> >>>>>
> >>>>> I get an error: Error in eval(expr, envir, enclos) : object
> >>>>> 'WeightsVector' not found
> >>>>> Notice that the function correctly prints length(WeightsVector). But
> >>>>> it looks like "lm" is looking for weights (in the 4th line of the
> >>>>> function) OUTSIDE the function and does not see WeightsVector.
> >>>>
> >>>> Have you tried putting WeightsVector in the "x" dataframe? That would
> >>>> seem to reduce the potential for environmental conflation.
> >>>>
> >>>> From the details section of help(lm):
> >>>> "All of weights, subset and offset are evaluated in the same way as
> >>>> variables in formula, that is first in data and then in the
> >>>> environment of formula."
> >>>>
> >>>>
> >>>>> Why is it looking outside the function for the object that has just
> >>>>> been defined inside the function?
> >>
> >>
> 
> 


