[R] Issue with subset in glm

Bert Gunter bgunter.4567 at gmail.com
Wed Mar 22 15:38:47 CET 2017


The subset argument is evaluated in "data" first, then in the
environment of the formula (here, the caller's environment), and so on
up the enclosing environments.
So:

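1) In your first example, stype is found in apistrat, where it is a
*vector* (the stype column), not your global "E". The subset expression
therefore compares the column with itself and is identically TRUE,
hence is equivalent to making the call without the subset argument.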

2) The second call fits only the rows with stype == "E", hence is different.

3) "i" is not found in mtcars, hence is looked for in the caller,
where it has the value 4, giving the same subset and result as in the
next call.
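
The three points above can be reproduced without the survey package.
The sketch below is only an illustration: the data frame `dat`, its
columns, and the variable `wanted` are made up for the purpose and do
not come from the original post. It shows a global variable being
masked by a column of the same name, and that renaming the global
restores the intended subset.

```r
# Toy data with a column named 'stype' (hypothetical stand-in for apistrat)
set.seed(1)
dat <- data.frame(
  stype = rep(c("E", "H", "M"), each = 10),
  x     = rnorm(30)
)
dat$y <- 1 + 2 * dat$x + rnorm(30)

# Global variable with the same name as the column
stype <- "E"

# Inside 'subset', the bare name stype is looked up in 'dat' first, so this
# compares the column with itself and keeps every row
fit_masked <- glm(y ~ x, data = dat, subset = dat$stype == stype)
fit_full   <- glm(y ~ x, data = dat)

# A name that is not a column of 'dat' is found in the calling environment
wanted  <- "E"
fit_sub <- glm(y ~ x, data = dat, subset = dat$stype == wanted)
fit_lit <- glm(y ~ x, data = dat, subset = dat$stype == "E")

# Number of rows actually used in each fit
nobs(fit_masked)
nobs(fit_sub)

# Masked call matches the unsubsetted fit; renamed call matches the literal
all.equal(coef(fit_masked), coef(fit_full))
all.equal(coef(fit_sub), coef(fit_lit))
```

Choosing a name for the global that cannot collide with a column of
"data" is one simple way to avoid the problem.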


Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Mar 21, 2017 at 8:50 AM, Ganz, Carl <carlganz at ucla.edu> wrote:
> Hello,
>
> I am experiencing odd behavior with the subset parameter for glm. It appears that the parameter uses non-standard evaluation, but only in some cases. Below is a reproducible example.
>
> library(survey) # for example dataset
>
> data(api)
> stype <- "E"
> (a <- glm(api00~ell+meals+mobility, data = apistrat,
>     subset = apistrat$stype == stype))
> (b <- glm(api00~ell+meals+mobility, data = apistrat,
>     subset = apistrat$stype == "E"))
> # should be equal since stype = "E" but they aren't
> coef(a)==coef(b)
>
> # for some reason works as expected here
> i = 4
> (c <- glm(mpg ~ wt, data = mtcars, subset = mtcars$cyl==i))
> (d <- glm(mpg ~ wt, data = mtcars, subset = mtcars$cyl==4))
> coef(c)==coef(d)
>
> I can't really explain what is happening so I would appreciate help.
>
> Kind Regards,
> Carl Ganz


