[R] Model matrix with redundant columns included

Marc Schwartz marc_schwartz at comcast.net
Wed Mar 14 05:47:06 CET 2007


On Wed, 2007-03-14 at 14:57 +1100, Hong Ooi wrote:

> Hello,
> 
> Normally when you call model.matrix, you get a matrix that has
> aliased/redundant columns deleted. For example:
> 
> > m <- expand.grid(a=factor(1:3), b=factor(1:3))
> > model.matrix(~a + b, m)
>   (Intercept) a2 a3 b2 b3
> 1           1  0  0  0  0
> 2           1  1  0  0  0
> 3           1  0  1  0  0
> 4           1  0  0  1  0
> 5           1  1  0  1  0
> 6           1  0  1  1  0
> 7           1  0  0  0  1
> 8           1  1  0  0  1
> 9           1  0  1  0  1
> attr(,"assign")
> [1] 0 1 1 2 2
> attr(,"contrasts")
> attr(,"contrasts")$a
> [1] "contr.treatment"
> 
> attr(,"contrasts")$b
> [1] "contr.treatment"
> 
> The result is a matrix with 5 columns including the intercept.
> 
> However, for my purposes I need a matrix that includes all columns,
> including those that would normally be redundant. Is there any way to do
> this? For the example, this would be something like
> 
>   a1 a2 a3 b1 b2 b3
> 1  1  0  0  1  0  0
> 2  0  1  0  1  0  0
> 3  0  0  1  1  0  0
> 4  1  0  0  0  1  0
> 5  0  1  0  0  1  0
> 6  0  0  1  0  1  0
> 7  1  0  0  0  0  1
> 8  0  1  0  0  0  1
> 9  0  0  1  0  0  1
> 
> Including -1 as part of the model formula removes the intercept and adds
> the column for the base level of the first variable, but not the rest.
> 
> Thanks,


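There may be a better way, but this seems to work. Build a full
indicator matrix for each factor separately (with no intercept), then
bind them together: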

> m
  a b
1 1 1
2 2 1
3 3 1
4 1 2
5 2 2
6 3 2
7 1 3
8 2 3
9 3 3

MAT <- do.call("cbind", lapply(m, function(x) model.matrix(~ x - 1)))

> MAT
  x1 x2 x3 x1 x2 x3
1  1  0  0  1  0  0
2  0  1  0  1  0  0
3  0  0  1  1  0  0
4  1  0  0  0  1  0
5  0  1  0  0  1  0
6  0  0  1  0  1  0
7  1  0  0  0  0  1
8  0  1  0  0  0  1
9  0  0  1  0  0  1

colnames(MAT) <- names(unlist(lapply(m, levels)))
 
> MAT
  a1 a2 a3 b1 b2 b3
1  1  0  0  1  0  0
2  0  1  0  1  0  0
3  0  0  1  1  0  0
4  1  0  0  0  1  0
5  0  1  0  0  1  0
6  0  0  1  0  1  0
7  1  0  0  0  0  1
8  0  1  0  0  0  1
9  0  0  1  0  0  1
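
Note that the renaming step above works here because unlist() numbers
the entries within each factor, and the levels happen to be 1, 2, 3 as
well. If the levels have other labels, a variation along the following
lines should name the columns from the labels themselves. This is only a
sketch: full_dummies() is a made-up helper name, not something in base
R, and the "lo"/"mid"/"hi" factor is an invented example.

```r
# Example data with non-numeric level labels (invented for illustration)
m2 <- expand.grid(a = factor(1:3), b = factor(c("lo", "mid", "hi")))

# Hypothetical helper: one full indicator block per factor,
# with columns named <factor name><level label>
full_dummies <- function(df) {
  blocks <- lapply(names(df), function(nm) {
    f <- df[[nm]]
    mm <- model.matrix(~ f - 1)           # no intercept, so all levels are kept
    colnames(mm) <- paste0(nm, levels(f)) # columns follow the order of levels(f)
    mm
  })
  do.call(cbind, blocks)
}

MAT2 <- full_dummies(m2)
colnames(MAT2)
```

The levels of b sort alphabetically, so its columns come out in the
order hi, lo, mid.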


You can cbind() the (Intercept) column back in if you require it.
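
A rough sketch of that last step follows, together with an alternative
that I believe gives the same matrix in a single call: passing
model.matrix() a contrasts.arg list built with contrasts = FALSE, so
that each factor gets an identity "contrast" matrix and no level is
dropped. The names MAT_int, MAT_full and full_contrasts are just
illustrative.

```r
m <- expand.grid(a = factor(1:3), b = factor(1:3))

# Rebuild MAT as in the steps above
MAT <- do.call("cbind", lapply(m, function(x) model.matrix(~ x - 1)))
colnames(MAT) <- names(unlist(lapply(m, levels)))

# Option 1: prepend a column of ones as the intercept
MAT_int <- cbind("(Intercept)" = 1, MAT)

# Option 2: keep the intercept in the formula, but supply full
# (identity) contrast matrices so that no level is removed
full_contrasts <- lapply(m, contrasts, contrasts = FALSE)
MAT_full <- model.matrix(~ a + b, m, contrasts.arg = full_contrasts)

colnames(MAT_full)
```

Either way the resulting matrix is rank deficient by construction, so it
is suitable for bookkeeping or penalised fits rather than for an
ordinary least squares solve.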

HTH,

Marc Schwartz


