[Rd] scoping/non-standard evaluation issue

peter dalgaard pdalgd at gmail.com
Wed Jan 5 16:50:56 CET 2011


On Jan 5, 2011, at 14:44 , John Fox wrote:

> Dear Gabor,
> 
> I used str() to look at the two objects but missed the difference that you
> found. What I didn't quite understand was why one model worked but not the
> other when both were defined at the command prompt in the global
> environment.

I kind of suspect that the bug is that mod.1 works. That is, I can vaguely make out the contours of why mod.2 is not supposed to work, and if that reasoning holds, mod.1 should not work either. If so, something clearly needs more work. Perhaps some of the people who worked on implementing formula environments would like to chime in? (It's been a while, though.)
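
A minimal sketch of what may be going on, under the assumption that evaluating a bare call to `~` stamps the resulting formula with the evaluating frame, whereas an already-built formula object keeps the environment it was created in. Since update() re-evaluates the stored call in its caller's frame, mod.1 (whose call holds a bare call) would pick up the frame of f(), while mod.2 (whose call holds a formula object) would stay tied to the global environment. The helper name sees_local() is made up for this illustration.

```r
# Hypothetical helper: evaluate `fo` inside a function that defines `subs`
# and report whether `subs` lives in the resulting formula's environment.
sees_local <- function(fo) {
    subs <- 1:10
    env <- environment(eval(fo))   # eval() runs in this function's frame
    exists("subs", envir = env, inherits = FALSE)
}

# Bare call to `~`, as stored in mod.1$call: it has no environment yet,
# so evaluating it attaches the frame of sees_local().
sees_local(quote(y ~ x))   # TRUE

# Formula object, as stored in mod.2$call: it already carries the
# environment in which it was created, and evaluation leaves that alone.
sees_local(y ~ x)          # FALSE
```

If this reading is right, the subset argument is looked up starting from the formula's environment, which would explain why 'subs' is found for mod.1 but not for mod.2.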

> 
> Thanks,
> John
> 
> --------------------------------
> John Fox
> Senator William McMaster 
>  Professor of Social Statistics
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
> 
> 
>> -----Original Message-----
>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
>> On Behalf Of Gabor Grothendieck
>> Sent: January-04-11 6:56 PM
>> To: John Fox
>> Cc: Sanford Weisberg; r-devel at r-project.org
>> Subject: Re: [Rd] scoping/non-standard evaluation issue
>> 
>> On Tue, Jan 4, 2011 at 4:35 PM, John Fox <jfox at mcmaster.ca> wrote:
>>> Dear r-devel list members,
>>> 
>>> On a couple of occasions I've encountered the issue illustrated by the
>>> following examples:
>>> 
>>> --------- snip -----------
>>> 
>>>> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
>>> +         Armed.Forces + Population + Year, data=longley)
>>> 
>>>> mod.2 <- update(mod.1, . ~ . - Year + Year)
>>> 
>>>> all.equal(mod.1, mod.2)
>>> [1] TRUE
>>>> 
>>>> f <- function(mod){
>>> +     subs <- 1:10
>>> +     update(mod, subset=subs)
>>> +     }
>>> 
>>>> f(mod.1)
>>> 
>>> Call:
>>> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>>>    Population + Year, data = longley, subset = subs)
>>> 
>>> Coefficients:
>>>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>>>   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>>>  Population          Year
>>>   1.164e+00    -1.911e+00
>>> 
>>>> f(mod.2)
>>> Error in eval(expr, envir, enclos) : object 'subs' not found
>>> 
>>> --------- snip -----------
>>> 
>>> I *almost* understand what's going on -- that is, clearly mod.1 and mod.2,
>>> or the formulas therein, are associated with different environments, but I
>>> don't quite see why.
>>> 
>>> Anyway, here are two "solutions" that work, but neither is in my view
>>> desirable:
>>> 
>>> --------- snip -----------
>>> 
>>>> f1 <- function(mod){
>>> +     assign(".subs", 1:10, envir=.GlobalEnv)
>>> +     on.exit(remove(".subs", envir=.GlobalEnv))
>>> +     update(mod, subset=.subs)
>>> +     }
>>> 
>>>> f1(mod.1)
>>> 
>>> Call:
>>> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>>>    Population + Year, data = longley, subset = .subs)
>>> 
>>> Coefficients:
>>>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>>>   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>>>  Population          Year
>>>   1.164e+00    -1.911e+00
>>> 
>>>> f1(mod.2)
>>> 
>>> Call:
>>> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>>>    Population + Year, data = longley, subset = .subs)
>>> 
>>> Coefficients:
>>>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>>>   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>>>  Population          Year
>>>   1.164e+00    -1.911e+00
>>> 
>>>> f2 <- function(mod){
>>> +     env <- new.env(parent=.GlobalEnv)
>>> +     attach(NULL)
>>> +     on.exit(detach())
>>> +     assign(".subs", 1:10, pos=2)
>>> +     update(mod, subset=.subs)
>>> +     }
>>> 
>>>> f2(mod.1)
>>> 
>>> Call:
>>> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>>>    Population + Year, data = longley, subset = .subs)
>>> 
>>> Coefficients:
>>>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>>>   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>>>  Population          Year
>>>   1.164e+00    -1.911e+00
>>> 
>>>> f2(mod.2)
>>> 
>>> Call:
>>> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>>>    Population + Year, data = longley, subset = .subs)
>>> 
>>> Coefficients:
>>>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>>>   3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>>>  Population          Year
>>>   1.164e+00    -1.911e+00
>>> 
>>> --------- snip -----------
>>> 
>>> The problem with f1() is that it will clobber a variable named .subs in the
>>> global environment; the problem with f2() is that .subs can be masked by a
>>> variable in the global environment.
>>> 
>>> Is there a better approach?
>>> 
>> 
>> I think there is something wrong with R here since the formula in the
>> call component of mod.1 has a "call" class whereas the corresponding
>> call component of mod.2 has "formula" class:
>> 
>>> class(mod.1$call[[2]])
>> [1] "call"
>>> class(mod.2$call[[2]])
>> [1] "formula"
>> 
>> If we reset call[[2]] to have "call" class then it works:
>> 
>>> mod.2a <- mod.2
>>> mod.2a$call[[2]] <- as.call(as.list(mod.2a$call[[2]]))
>>> f(mod.2a)
>> 
>> Call:
>> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>>    Population + Year, data = longley, subset = subs)
>> 
>> Coefficients:
>>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>>   Population          Year
>>    1.164e+00    -1.911e+00
>> 
>> 
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
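
On the quoted question of whether there is a better approach than f1() and f2(): one option, offered as a sketch and not as a recommended fix, is to avoid the name lookup altogether by splicing the value of the local variable into the update() call with bquote(). The function name f3() is made up here, and the models are rebuilt only so that the snippet is self-contained.

```r
# Models as in the quoted example (longley ships with the datasets package).
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
            Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# Hypothetical alternative to f1()/f2(): .(subs) is replaced by the value
# of the local variable before update() is called, so the refitted call
# carries the subset vector itself and no variable named `subs` has to be
# found from the formula's environment.
f3 <- function(mod) {
    subs <- 1:10
    eval(bquote(update(mod, subset = .(subs))))
}

fit.1 <- f3(mod.1)
fit.2 <- f3(mod.2)
all.equal(coef(fit.1), coef(fit.2))
```

The price is that the stored call of the refitted model contains the subset values and not a variable name, which may be unwelcome for long vectors. Nothing is written to the global environment or the search path, so the clobbering and masking problems of f1() and f2() should not arise.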

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


