R-alpha: model.matrix misbehaves on two-sided formulas

Thomas Lumley thomas@biostat.washington.edu
Wed, 10 Sep 1997 15:15:46 -0700 (PDT)


On 10 Sep 1997, Douglas Bates wrote:

> I am using R for a graduate class on linear models that I teach.  I
> sent the class a sample transcript of explicit construction of the
> model matrix and use of the QR decomposition.  One of the members of
> the class picked up on the fact that the model matrix was not what I
> advertised that it should be when a two-sided formula is used.  A
> one-sided formula works properly.  For example,
>  > library(st849)          # loads data sets from the text
>  > e1.1
>        density speed
>   [1,]    20.4  38.8
>  > model.matrix(~density, data = e1.1)  # what we would expect
>        (Intercept) density
>   [1,]           1    20.4
>   [2,]           1    27.4
>  > model.matrix(sqrt(speed) ~ density, data = e1.1) # not what we want
>        (Intercept) density
>   [1,]           1    38.8
>   [2,]           1    31.5
> 
> I think that somewhere along the line an expression like
>  eval(form[[2]], data)
> is being used when it would be better to use
>  eval(form[[length(form)]], data)

No, it's worse than that.  In versions of R before 0.50-a4, the new alpha
release, model.matrix(formula, dataframe) assumes that its data argument
was generated by
	dataframe <- model.frame(formula, data)
It does *not* check the names or the order of the columns.  This means that
model.matrix(formula, some.data.frame) will usually give the wrong answer
when some.data.frame is the raw data rather than a model frame.

In R0.50-a4 this is fixed.
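
For anyone on an older version, a minimal sketch of the safe calling pattern
follows: build the model frame explicitly and hand that to model.matrix.
The data frame d is an assumption, reconstructed from the rows visible in
the quoted transcript, and the names one_sided, two_sided, mf and X are
only for illustration.  A plausible reading of the wrong output above
(a guess, not checked against the source) is that the density term was
looked up by position: it is column 2 of the model frame, but column 2 of
e1.1 is speed.

```r
# Assumed stand-in for e1.1: only the values visible in the quoted transcript.
d <- data.frame(density = c(20.4, 27.4), speed = c(38.8, 31.5))

one_sided <- ~ density
two_sided <- sqrt(speed) ~ density

# Element 2 of a formula is the right-hand side only when there is no response.
one_sided[[2]]                   # density
two_sided[[2]]                   # sqrt(speed)
two_sided[[length(two_sided)]]   # density

# Build the model frame first; its columns are the response, then the terms.
mf <- model.frame(two_sided, data = d)
names(mf)

# Derive the model matrix from the model frame rather than from the raw data.
X <- model.matrix(two_sided, mf)

# The density column of X should now match the density variable, not speed.
X[, "density"]
```

Passing the model frame means model.matrix receives exactly the layout it
expects, so the result does not depend on how the columns of the original
data happen to be ordered.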


Thomas Lumley
------------------------------------------------------+------
Biostatistics		: "Never attribute to malice what  :
Uni of Washington	:  can be adequately explained by  :
Box 357232		:  incompetence" - Hanlon's Razor  :
Seattle WA 98195-7232	:				   :
------------------------------------------------------------
