[R] Order of formula terms in model.matrix
Charles C. Berry
ccberry at ucsd.edu
Sun Jan 17 19:34:40 CET 2016
On Sun, 17 Jan 2016, Lars Bishop wrote:
> I’d appreciate your help on understanding the following.
> It is not very clear to me from the model.matrix documentation, why
> simply changing the order of terms in the formula may change the number
> of resulting columns. Please note I’m purposely not including main
> effects in the model formula in this case.
IIRC, there are some heuristics involved harking back to the White Book. I
recall there have been discussions on this list and/or R-devel of whether
and how this could be fixed, but I cannot seem to lay my browser on them
right now.
> x1 <- rnorm(100)
> f1 <- factor(sample(letters[1:3], 100, replace = TRUE))
> trt <- sample(c(-1,1), 100, replace = TRUE)
> df <- data.frame(x1=x1, f1=f1, trt=trt)
> dim(model.matrix( ~ x1:trt + f1:trt, data = df))
>  100 4
> dim(model.matrix(~ f1:trt + x1:trt, data = df))
>  100 5
By `x1:trt' I guess you mean the same thing as `I(x1*trt)'.
If you use the latter form, the issue you raise goes away.
Note that `I(some.expr)' gives you the ability to force the behavior of
model.matrix to be exactly what you want by suitably crafting `some.expr'.
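
As a minimal sketch of that suggestion, reusing the variables from the
quoted example (the seed and the names `m1' and `m2' are assumptions added
here for illustration only), one can build the model matrix with both term
orders and compare the results:

```r
# Sketch only: set.seed() and the names m1/m2 are not from the original post.
set.seed(1)
x1  <- rnorm(100)
f1  <- factor(sample(letters[1:3], 100, replace = TRUE))
trt <- sample(c(-1, 1), 100, replace = TRUE)
df  <- data.frame(x1 = x1, f1 = f1, trt = trt)

# I(x1*trt) is evaluated as ordinary arithmetic and enters the formula
# as a single numeric variable rather than as an interaction term.
m1 <- model.matrix(~ I(x1*trt) + f1:trt, data = df)
m2 <- model.matrix(~ f1:trt + I(x1*trt), data = df)

# Compare the two orderings: dimensions and column names.
dim(m1)
dim(m2)
identical(dim(m1), dim(m2))
setequal(colnames(m1), colnames(m2))
```

With the `I()' form, `trt' no longer shows up inside an earlier interaction
term, which appears to be what the coding heuristic for `f1' keys on.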