[R] somewhat ineffective suppressing intercepts

Ross Boylan ross at biostat.ucsf.edu
Fri Sep 27 05:36:26 CEST 2013


Suppressing the intercept and contr.sum coding are not quite working as
I expect:
> mf <- data.frame(A=C(factor(c("a", "b", "c")), contr.sum))
> mm <- model.matrix(~0+A, data=mf)
> mm
  Aa Ab Ac
1  1  0  0
2  0  1  0
3  0  0  1

What I expect (and want) is
  A1 A2
1  1  0
2  0  1
3 -1 -1
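
For reference, the coding that contr.sum defines for a three-level factor can be printed directly. This is only a small sketch; the variable names levs and cs are mine, chosen for illustration. The rows of the result are the values I would expect to see for levels a, b and c in the model matrix.

```r
# Levels of the factor A from the example above
levs <- c("a", "b", "c")

# contr.sum() accepts a vector of level names and returns the
# sum-to-zero contrast matrix: one row per level, (k - 1) columns
cs <- contr.sum(levs)
cs
```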

In more complicated models, every term except the first one is coded as
I expect.  That includes A itself when it is interacted with other
variables.

It seems R has decided the model really needs an intercept and is
throwing in an extra column for the first factor to ensure that I get it,
even though I said with the "0" that I didn't want it.
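
A quick check that seems consistent with this suspicion, using the stripped-down example: if every row of the matrix sums to 1, then a constant column is a linear combination of Aa, Ab and Ac, so the intercept is still in the column space even though no column is labelled as one. This is just a diagnostic sketch, not an explanation of why model.matrix behaves this way.

```r
# Rebuild the stripped-down example
mf <- data.frame(A = C(factor(c("a", "b", "c")), contr.sum))
mm <- model.matrix(~0 + A, data = mf)

# Each row has exactly one indicator set, so the row sums are all 1;
# the constant (intercept) vector is therefore Aa + Ab + Ac
rs <- rowSums(mm)
rs
```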

BTW, ~A produces an intercept and the two columns expected above.  But I
don't want the intercept; the model matrix is going into a multinomial
model for which the intercept is not identified (since all intercepts
produce the same predicted probabilities).
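
One workaround I can sketch, assuming it is acceptable to build the matrix with the intercept and then discard that column afterwards (the name mm_full is just for illustration):

```r
# Same data frame as in the stripped-down example
mf <- data.frame(A = C(factor(c("a", "b", "c")), contr.sum))

# Build the model matrix WITH an intercept so A gets its contr.sum coding
mm_full <- model.matrix(~A, data = mf)

# Drop the intercept column by name; drop = FALSE keeps the result a matrix
mm <- mm_full[, colnames(mm_full) != "(Intercept)", drop = FALSE]
mm
```

Note that subsetting this way does not carry over the "assign" and "contrasts" attributes of the original model matrix, so anything downstream that relies on them would need them restored by hand.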

What's going on here?

R 2.15.1


P.S. I think the above stripped down example illustrates the problem,
but here's a more expanded model:

> mf <- expand.grid(C(factor(c("a", "b", "c")), contr.sum),
+                   C(factor(c("f", "t")), contr.sum))
> colnames(mf) <- c("A", "H")
> mf$x <- seq(6)
> mf
  A H x
1 a f 1
2 b f 2
3 c f 3
4 a t 4
5 b t 5
6 c t 6
> myformula <- ~0+A*H*x
> mm <- model.matrix(myformula, data=mf)
> mm
  Aa Ab Ac H1 x A1:H1 A2:H1 A1:x A2:x H1:x A1:H1:x A2:H1:x
1  1  0  0  1 1     1     0    1    0    1       1       0
2  0  1  0  1 2     0     1    0    2    2       0       2
3  0  0  1  1 3    -1    -1   -3   -3    3      -3      -3
4  1  0  0 -1 4    -1     0    4    0   -4      -4       0
5  0  1  0 -1 5     0    -1    0    5   -5       0      -5
6  0  0  1 -1 6     1     1   -6   -6   -6       6       6


