[Rd] default data= arg to glm() (PR#844)

Thomas Lumley tlumley@u.washington.edu
Fri, 9 Feb 2001 14:57:19 -0800 (PST)

On Fri, 9 Feb 2001 pperkins@ucsd.edu wrote:
> the lines
>    if (missing(data)) 
>       data <- environment(formula)
> in glm() seem to contradict the documentation:
>    data: an optional data frame containing the variables in the model.
>          By default the variables are taken from the environment from
>          which `glm' is called.
> actually, the near lack of other references to "data" in glm() is not
> clear to me, so i may have this wrong.  but a small test seems to bear
> out the problem:

The documentation is slightly out of date.  The current behaviour is that
the variables are taken from the environment in which the model formula is
defined. This will usually, but not always, be the same thing. This
behaviour is new in 1.2.0, and is designed to give functions like
predict() and update() a fighting chance.
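
A small illustration of the difference, as a sketch only: make_formula is
a hypothetical helper and the numbers are made up. The variables exist
only inside the function where the formula is created, yet glm() finds
them through environment(formula). This assumes R 1.2.0 or later.

```r
# Hypothetical helper: x and y live only in this function's environment.
make_formula <- function() {
  x <- c(1, 2, 3, 4, 5)
  y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
  y ~ x   # the formula remembers the environment it was created in
}

f <- make_formula()

# No data= argument is given, and x and y are not defined where glm() is
# called; they are looked up in environment(f) instead.
fit <- glm(f)
coef(fit)
```

If variables were taken from the calling environment, as the old
documentation says, this call would fail to find x and y.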

However, the line you quote is in fact irrelevant to this. The
variable `data' that it modifies is not used in fitting the model; it is
just returned as part of the result. So, you ask, if that `data' variable
is never used in the fit, how does the data= argument have any effect?

This is Deep Magic, and probably worth explaining.
A copy of the call to glm() (or other modelling functions) is grabbed by
match.call() and turned into a call to model.frame(). This call is then
evaluated in the parent frame (the environment from which glm() was
called) to create a model frame. The model.matrix() function is then
called to create a design matrix from the model frame.
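
The steps above can be sketched roughly as follows. This is a simplified
illustration of the idiom, not the actual source of glm(); my_fit is a
hypothetical name, and only the formula, data and subset arguments are
passed through.

```r
# Hypothetical modelling function illustrating the match.call() idiom.
my_fit <- function(formula, data, subset, ...) {
  # Grab the call exactly as the user wrote it, unevaluated.
  mf <- match.call(expand.dots = FALSE)

  # Keep only the arguments that model.frame() understands; arguments
  # the user did not supply match to 0 and are dropped.
  keep <- match(c("formula", "data", "subset"), names(mf), 0L)
  mf <- mf[c(1L, keep)]

  # Turn the call to my_fit() into a call to model.frame() ...
  mf[[1L]] <- quote(stats::model.frame)

  # ... and evaluate it in the caller's frame, so that `data' and
  # `subset' are resolved where the user typed them.
  mf <- eval(mf, parent.frame())

  # Build the design matrix and response from the model frame.
  X <- model.matrix(attr(mf, "terms"), mf)
  y <- model.response(mf)
  list(frame = mf, X = X, y = y)
}

d <- data.frame(u = 1:4, v = c(2, 4, 5, 8))
res <- my_fit(v ~ u, data = d)
dim(res$X)
```

Note that the function body never touches its own `data' argument
directly; the argument only takes effect through the rewritten call.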

This is how we fake dynamic scope -- using the values of variables in the
caller's environment -- in a language that really has static scope.
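
A toy contrast between the two lookup rules, as a sketch with hypothetical
function names: show_static uses R's ordinary static (lexical) scoping,
while show_dynamic imitates dynamic scope by evaluating in the caller's
frame.

```r
x <- "global"

# Static scope: x is looked up where the function was defined.
show_static <- function() x

# Faked dynamic scope: x is looked up in the frame of whoever called us.
show_dynamic <- function() eval(quote(x), parent.frame())

caller <- function() {
  x <- "caller"
  c(static = show_static(), dynamic = show_dynamic())
}

res <- caller()
res
```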

The moral of this is that it pays to put the variables you want to use in
a data frame. It's much easier to find them that way than by playing
clever games with environments.
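
For comparison, a sketch of the data frame approach with made-up numbers;
fit_anywhere is a hypothetical wrapper. Because the variables travel with
the data frame, the fit does not depend on which environment the formula
or the call happens to live in, and predict() can be given new data in
the same form.

```r
d <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))

# Hypothetical wrapper: the variables are found in `dat', so it does not
# matter where this function is defined or called from.
fit_anywhere <- function(dat) glm(y ~ x, data = dat)

fit <- fit_anywhere(d)

# Prediction for a new observation, supplied as a data frame too.
p <- predict(fit, newdata = data.frame(x = 6))
p
```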


Thomas Lumley			Asst. Professor, Biostatistics
tlumley@u.washington.edu	University of Washington, Seattle
