[Rd] scoping/non-standard evaluation issue

John Fox jfox at mcmaster.ca
Wed Jan 5 14:44:18 CET 2011


Dear Gabor,

I used str() to look at the two objects but missed the difference that you
found. What I didn't quite understand was why one model worked but not the
other when both were defined at the command prompt in the global
environment.
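
(A possible reading, not confirmed anywhere in this thread: update() with a
new formula appears to store a formula object in the model call, and a formula
object carries the environment in which it was created. A bare, unevaluated
formula in the call has no environment yet and acquires one wherever the call
is re-evaluated, for instance inside f(). The sketch below only inspects the
two stored formulas; make_formula() and local_var are hypothetical names used
for illustration.)

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# The formula stored in each call: a bare call vs. a formula object
class(mod.1$call$formula)
class(mod.2$call$formula)

# A bare call has no environment attached; a formula object does
environment(mod.1$call$formula)
environment(mod.2$call$formula)

# A formula remembers the environment in which it was created
make_formula <- function() {
    local_var <- 1:10   # visible only inside this function
    y ~ x
}
fm <- make_formula()
exists("local_var", envir = environment(fm), inherits = FALSE)
```

If this reading is right, 'subs' is looked up from the formula's environment
when the model frame is rebuilt, which would be the frame of f() for mod.1
but the global environment for mod.2.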

Thanks,
 John

--------------------------------
John Fox
Senator William McMaster 
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -----Original Message-----
> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
> On Behalf Of Gabor Grothendieck
> Sent: January-04-11 6:56 PM
> To: John Fox
> Cc: Sanford Weisberg; r-devel at r-project.org
> Subject: Re: [Rd] scoping/non-standard evaluation issue
> 
> On Tue, Jan 4, 2011 at 4:35 PM, John Fox <jfox at mcmaster.ca> wrote:
> > Dear r-devel list members,
> >
> > On a couple of occasions I've encountered the issue illustrated by the
> > following examples:
> >
> > --------- snip -----------
> >
> >> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
> > +         Armed.Forces + Population + Year, data=longley)
> >
> >> mod.2 <- update(mod.1, . ~ . - Year + Year)
> >
> >> all.equal(mod.1, mod.2)
> > [1] TRUE
> >>
> >> f <- function(mod){
> > +     subs <- 1:10
> > +     update(mod, subset=subs)
> > +     }
> >
> >> f(mod.1)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >    Population + Year, data = longley, subset = subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >  Population          Year
> >   1.164e+00    -1.911e+00
> >
> >> f(mod.2)
> > Error in eval(expr, envir, enclos) : object 'subs' not found
> >
> > --------- snip -----------
> >
> > I *almost* understand what's going on -- that is, clearly mod.1 and mod.2,
> > or the formulas therein, are associated with different environments, but I
> > don't quite see why.
> >
> > Anyway, here are two "solutions" that work, but neither is in my view
> > desirable:
> >
> > --------- snip -----------
> >
> >> f1 <- function(mod){
> > +     assign(".subs", 1:10, envir=.GlobalEnv)
> > +     on.exit(remove(".subs", envir=.GlobalEnv))
> > +     update(mod, subset=.subs)
> > +     }
> >
> >> f1(mod.1)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >    Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >  Population          Year
> >   1.164e+00    -1.911e+00
> >
> >> f1(mod.2)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >    Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >  Population          Year
> >   1.164e+00    -1.911e+00
> >
> >> f2 <- function(mod){
> > +     env <- new.env(parent=.GlobalEnv)
> > +     attach(NULL)
> > +     on.exit(detach())
> > +     assign(".subs", 1:10, pos=2)
> > +     update(mod, subset=.subs)
> > +     }
> >
> >> f2(mod.1)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >    Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >  Population          Year
> >   1.164e+00    -1.911e+00
> >
> >> f2(mod.2)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >    Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >  Population          Year
> >   1.164e+00    -1.911e+00
> >
> > --------- snip -----------
> >
> > The problem with f1() is that it will clobber a variable named .subs in the
> > global environment; the problem with f2() is that .subs can be masked by a
> > variable in the global environment.
> >
> > Is there a better approach?
> >
> 
> I think there is something wrong with R here since the formula in the
> call component of mod.1 has a "call" class whereas the corresponding
> call component of mod.2 has "formula" class:
> 
> > class(mod.1$call[[2]])
> [1] "call"
> > class(mod.2$call[[2]])
> [1] "formula"
> 
> If we reset call[[2]] to have "call" class then it works:
> 
> > mod.2a <- mod.2
> > mod.2a$call[[2]] <- as.call(as.list(mod.2a$call[[2]]))
> > f(mod.2a)
> 
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = subs)
> 
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
> 
> 
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
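
On the question "Is there a better approach?" quoted above, one alternative
is sketched here. It is not an answer given in the thread. The idea is to
splice the value of the subset into the update() call with bquote(), so that
nothing has to be found by name when the model is refitted. f3() is a
hypothetical name, and the sketch assumes update() is happy to receive the
subset as a literal integer vector.

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

f3 <- function(mod) {
    subs <- 1:10
    # .(subs) is replaced by the value of subs before update() runs,
    # so the refit does not need to find a variable called 'subs'
    eval(bquote(update(mod, subset = .(subs))))
}

# Both models should now refit on the same ten rows
all.equal(coef(f3(mod.1)), coef(f3(mod.2)))
nobs(f3(mod.2))
```

The trade-off is that the call stored in the refitted model holds the subset
values themselves rather than a variable name, which may matter if the call
is inspected or updated again later.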


