# R-alpha: model.matrix misbehaves on two-sided formulas

**Thomas Lumley**
thomas@biostat.washington.edu

*Wed, 10 Sep 1997 15:15:46 -0700 (PDT)*

On 10 Sep 1997, Douglas Bates wrote:
> I am using R for a graduate class on linear models that I teach. I
> sent the class a sample transcript of explicit construction of the
> model matrix and use of the QR decomposition. One of the members of
> the class picked up on the fact that the model matrix was not what I
> advertised that it should be when a two-sided formula is used. A
> one-sided formula works properly. For example,
>
>     > library(st849) # loads data sets from the text
>     > e1.1
>          density speed
>     [1,]    20.4  38.8
>     > model.matrix(~density, data = e1.1) # what we would expect
>          (Intercept) density
>     [1,]           1    20.4
>     [2,]           1    27.4
>     > model.matrix(sqrt(speed) ~ density, data = e1.1) # not what we want
>          (Intercept) density
>     [1,]           1    38.8
>     [2,]           1    31.5
>
> I think that somewhere along the line an expression like
>
>     eval(form[[2]], data)
>
> is being used when it would be better to use
>
>     eval(form[[length(form)]], data)

No, it's worse than that. In versions of R before 0.50-a4 (the new alpha
release), model.matrix(formula, dataframe) assumes that its data argument
was generated by

    dataframe <- model.frame(formula, data)

It does *not* check the names or order of the columns. This means that
model.matrix(formula, some.data.frame) will usually give the wrong answer.
In R 0.50-a4 this is fixed.
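
A minimal sketch of the two-step pattern described above, written for a current version of R: build the model frame explicitly, then pass that frame to model.matrix so the column layout is the one it expects. The data frame `d` and its values are invented for illustration and are not the `e1.1` data set from the quoted transcript.

```r
# Hypothetical stand-in data (invented values, not the e1.1 data set).
d <- data.frame(density = c(10, 20, 30),
                speed   = c(49, 36, 25))

form <- sqrt(speed) ~ density

# Step 1: build the model frame. Its first column is the evaluated
# response, sqrt(speed); the predictor columns follow.
mf <- model.frame(form, data = d)

# Step 2: build the model matrix from that frame rather than from the
# raw data, so the column order matches what model.matrix assumes.
X <- model.matrix(form, mf)

# Sanity check: the density column of X must hold the density values,
# not the speed values reported in the thread.
stopifnot(isTRUE(all.equal(unname(X[, "density"]), d$density)))
```

The final check is the one worth keeping in teaching material: it fails loudly if the predictor column is ever filled from the wrong variable.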
Thomas Lumley
------------------------------------------------------+------
Biostatistics : "Never attribute to malice what :
Uni of Washington : can be adequately explained by :
Box 357232 : incompetence" - Hanlon's Razor :
Seattle WA 98195-7232 : :
------------------------------------------------------------