[R] stumped by eval

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Wed Feb 13 11:18:22 CET 2008


Berwin A Turlach wrote:
> G'day Peter,
>
> On Wed, 13 Feb 2008 08:03:07 +0100
> Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:
>
>   
>> Ross Boylan wrote:
>>     
>>> In the following example, the inner evaluation pulls in the global
>>> value of subset (a function) rather than the one I thought I was
>>> passing in (a vector).  Can anyone help me understand what's going
>>> on, and what I need to do to fix the problem?
>>>       
>
> [...]
>
>   
>> The point is that subset (and offset) arguments are subject to the
>> same evaluation rules as the terms inside the formula: First look in
>> "data", then in the environment of the formula, which in this case is
>> the global environment.
>>     
>
> Perhaps I have a [senior|blonde]-day today, but this does not seem to
> be the full explanation about what is going on to me.  According to this
> explanation the following should not work:
>
>   
>> lm(Reading~0+Spec+Reader, netto, subset=c(1) )
>>     
>
> Call:
> lm(formula = Reading ~ 0 + Spec + Reader, data = netto, subset = c(1))
>
> Coefficients:
>   Spec  Reader  
>      1      NA  
>
> since the value passed to subset is not part of "data" and not in the
> global environment. But, obviously, it works. 

It is, however, an expression that can be evaluated in the global
environment, and that works.

>  OTOH, if we change f0 to
>
>   
>> f0
>>     
> function(formula, data, subset, na.action)
> {
>   lm(formula, data, subset=subset, na.action=na.action)
> }
>
> then we get the same behaviour as with Ross's use of f1 inside of f0:
>
>   
>> t3 <- f0(Reading~0+Spec+Reader, netto, c(1) )
>>     
> Error in xj[i] : invalid subscript type 'closure'
>
>   
I told you it was elusive... The thing is that lm() is using nonstandard
evaluation, so it sees the _symbol_ subset --- er, wait a minute, too
many "subset"s, let us define it like this instead

f0 <- function(formula, data, s, na.action)
{
  lm(formula, data, subset=s, na.action=na.action)
}

OK, lm() sees the _symbol_ s passed as subset and then looks for it
first in "data" and then in environment(formula). It never gets the
idea to look for "s" in the evaluation frame of f0.
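
A minimal sketch of that lookup rule, for anyone who wants to try it. The data frame d and its values are made up here to stand in for netto (they are an assumption, not taken from the thread), and the formula is simplified to a single predictor:

```r
# Made-up data standing in for 'netto' (values are an assumption)
d <- data.frame(Reading = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
                Spec    = c(1, 1, 2, 2, 3, 3))

f0 <- function(formula, data, s) {
  lm(formula, data, subset = s)
}

# 's' is neither a column of 'd' nor visible from environment(formula),
# so the lookup fails even though f0 received a perfectly good argument.
res <- try(f0(Reading ~ Spec, d, 1:3), silent = TRUE)
failed_without_s <- inherits(res, "try-error")

# Once an 's' exists where the formula was created, lm() picks up *that*
# one and silently ignores the 5:6 that was passed to f0.
s <- 1:3
fit <- f0(Reading ~ Spec, d, 5:6)
n_used <- length(residuals(fit))   # rows taken from the outer s
```

The second call is arguably the more dangerous case: nothing fails, but the fit uses rows 1 to 3 rather than the rows the caller asked for.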

One workaround is this:

> f0
function(formula, data, s, na.action)
{
  eval(bquote(lm(formula, data, subset=.(s), na.action=na.action)))
}

Another, I think better, way is

> f0
function(formula, data, s, na.action)
{
  eval(bquote(lm(formula, data, subset=.(substitute(s)), na.action=na.action)))
}


The latter also allows

> f0(Reading~0+Spec+Reader, netto, Spec>0 )

Call:
lm(formula = formula, data = data, subset = Spec > 0, na.action = na.action)

Coefficients:
   Spec   Reader
-1.2976   0.7934
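
To see the difference between the two bquote() variants side by side, here is a small sketch. The names f_value and f_expr and the data frame d are hypothetical, introduced only for illustration; the formula is simplified to one predictor:

```r
# Made-up data standing in for 'netto' (values are an assumption)
d <- data.frame(Reading = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
                Spec    = c(1, 1, 2, 2, 3, 3))

# Variant 1: splice the *value* of s into the call before lm() sees it
f_value <- function(formula, data, s) {
  eval(bquote(lm(formula, data, subset = .(s))))
}

# Variant 2: splice the *unevaluated expression* the caller typed
f_expr <- function(formula, data, s) {
  eval(bquote(lm(formula, data, subset = .(substitute(s)))))
}

# Both variants accept an index vector
n_value <- length(residuals(f_value(Reading ~ Spec, d, 1:3)))

# Only variant 2 accepts a condition on columns of 'data', because the
# expression Spec > 1 reaches lm() intact and is evaluated inside 'd'
n_expr <- length(residuals(f_expr(Reading ~ Spec, d, Spec > 1)))

# Variant 1 has to evaluate Spec > 1 in the caller's environment,
# where no object called 'Spec' exists
res <- try(f_value(Reading ~ Spec, d, Spec > 1), silent = TRUE)
value_variant_failed <- inherits(res, "try-error")
```

The trade-off is that variant 2 re-exposes the caller's expression to the data-then-formula-environment lookup, so a plain variable passed as s must be visible from environment(formula).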


> Moreover, with the original definition of f0:
>   
>> f0
>>     
> function(formula, data, subset, na.action)
> {
>   f1(formula, data, subset, na.action)
> }
>   
>> (f1(Reading~0+Spec+Reader, netto, subset= Spec==1 ))
>>     
>   Reading Spec Reader
> 1       1    1      1
>   
>> f0(Reading~0+Spec+Reader, netto, subset= Spec==1 )
>>     
> Error in xj[i] : invalid subscript type 'closure'
>
> Given your explanation, I would have expected this to work.
>   
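I think the issue here is still that model.frame ends up being called
with subset=subset rather than subset= Spec==1, so the symbol "subset"
is looked up in "data" and then in the global environment, where it
finds the function subset().
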
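
A sketch of how that can happen. Ross's f1 is not shown in this message, so the f1 below is a hypothetical reconstruction that simply forwards its own call to model.frame(); the data frame d is likewise made up:

```r
# Made-up data standing in for 'netto' (values are an assumption)
d <- data.frame(Reading = c(1, 2, 3, 4),
                Spec    = c(1, 1, 2, 2),
                Reader  = c(1, 2, 1, 2))

# Hypothetical f1: rewrite the matched call into a model.frame() call
# and evaluate it in the caller's frame, as lm() does internally
f1 <- function(formula, data, subset) {
  mf <- match.call()
  mf[[1]] <- as.name("model.frame")
  eval(mf, parent.frame())
}

f0 <- function(formula, data, subset) {
  f1(formula, data, subset)
}

# Called directly, the matched call contains subset = Spec == 1,
# which is evaluated inside 'd'
direct <- f1(Reading ~ Spec + Reader, d, subset = Spec == 1)
n_direct <- nrow(direct)

# Called via f0, the matched call contains subset = subset. The symbol
# is not a column of 'd', so it resolves to the function subset(),
# and indexing a data frame with a function fails.
res <- try(f0(Reading ~ Spec + Reader, d, subset = Spec == 1),
           silent = TRUE)
via_f0_failed <- inherits(res, "try-error")
```

Under these assumptions the direct call keeps the two rows with Spec equal to 1, while the call through f0 stops with an error.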
> Reading up on `subset' in ?model.frame also does not seem to shed light
> onto what is going on.
>
> Remaining confused.....
>
> Cheers,
>
> 	Berwin
>
>   


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


