[Rd] scoping/non-standard evaluation issue
John Fox (jfox at mcmaster.ca)
Tue Jan  4 22:35:35 CET 2011

Dear r-devel list members,
On a couple of occasions I've encountered the issue illustrated by the
following examples:
--------- snip -----------
> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed + 
+         Armed.Forces + Population + Year, data=longley)
> mod.2 <- update(mod.1, . ~ . - Year + Year)
> all.equal(mod.1, mod.2)
[1] TRUE
> 
> f <- function(mod){
+     subs <- 1:10
+     update(mod, subset=subs)
+     }
    
> f(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00  
> f(mod.2)
Error in eval(expr, envir, enclos) : object 'subs' not found
--------- snip -----------
I *almost* understand what's going on -- that is, clearly mod.1 and mod.2, or
the formulas therein, are associated with different environments -- but I
don't quite see why.
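One way to probe the difference is to compare the formula stored in each model's call. The following is a minimal diagnostic sketch, not part of the session above. It assumes that update() stores an already-evaluated formula object (which carries its own environment) in the new call, whereas the original lm() call holds only the unevaluated formula expression:

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
        Armed.Forces + Population + Year, data=longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# What is stored in each call: an unevaluated expression or a formula object?
class(mod.1$call$formula)
class(mod.2$call$formula)

# A formula object remembers the environment in which it was created;
# an unevaluated expression has no such environment attached.
environment(mod.1$call$formula)
environment(mod.2$call$formula)
```

If the assumption holds, re-evaluating mod.1's call inside f() would create a fresh formula whose environment is the frame of f(), where subs is visible, while mod.2's call would reuse a formula tied to the environment in which mod.2 was built, where subs does not exist.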
Anyway, here are two "solutions" that work, but neither is in my view
desirable:
--------- snip -----------
> f1 <- function(mod){
+     assign(".subs", 1:10, envir=.GlobalEnv)
+     on.exit(remove(".subs", envir=.GlobalEnv))
+     update(mod, subset=.subs)
+     }
> f1(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00  
> f1(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00  
> f2 <- function(mod){
+     env <- new.env(parent=.GlobalEnv)
+     attach(NULL)
+     on.exit(detach())
+     assign(".subs", 1:10, pos=2)
+     update(mod, subset=.subs)
+     }
> f2(mod.1)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00  
> f2(mod.2)
Call:
lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces + 
    Population + Year, data = longley, subset = .subs)
Coefficients:
 (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces  
   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03  
  Population          Year  
   1.164e+00    -1.911e+00  
--------- snip -----------
The problem with f1() is that it will clobber a variable named .subs in the
global environment; the problem with f2() is that .subs can be masked by a
variable in the global environment.
Is there a better approach?
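One possibility, offered only as a hedged sketch: pass the value of subs rather than its name, so that nothing needs to be looked up in any environment when the model is refitted. The function f3() below is hypothetical and has been considered only for lm objects like those above:

```r
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
        Armed.Forces + Population + Year, data=longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# Hypothetical alternative to f1() and f2(): do.call() evaluates the
# elements of its argument list first, so the vector 1:10 itself (not the
# symbol subs) is placed in the updated call.
f3 <- function(mod){
    subs <- 1:10
    do.call(update, list(mod, subset=subs))
    }

f3(mod.1)
f3(mod.2)
```

The trade-off is that the stored call then contains the subset values themselves instead of a variable name, which could become unwieldy for a long subset vector.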
Thanks,
 John
--------------------------------
John Fox
Senator William McMaster 
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
    
    