[R] Interpreting model matrix columns when using contr.sum

Gang Chen gangchen6 at gmail.com
Fri Jan 23 23:58:03 CET 2009


With the following example using contr.sum for both factors,

> dd <- data.frame(a = gl(3,4), b = gl(4,1,12))     # balanced 2-way
> model.matrix(~ a * b, dd, contrasts = list(a="contr.sum", b="contr.sum"))

   (Intercept) a1 a2 b1 b2 b3 a1:b1 a2:b1 a1:b2 a2:b2 a1:b3 a2:b3
1            1  1  0  1  0  0     1     0     0     0     0     0
2            1  1  0  0  1  0     0     0     1     0     0     0
3            1  1  0  0  0  1     0     0     0     0     1     0
4            1  1  0 -1 -1 -1    -1     0    -1     0    -1     0
5            1  0  1  1  0  0     0     1     0     0     0     0
6            1  0  1  0  1  0     0     0     0     1     0     0
7            1  0  1  0  0  1     0     0     0     0     0     1
8            1  0  1 -1 -1 -1     0    -1     0    -1     0    -1
9            1 -1 -1  1  0  0    -1    -1     0     0     0     0
10           1 -1 -1  0  1  0     0     0    -1    -1     0     0
11           1 -1 -1  0  0  1     0     0     0     0    -1    -1
12           1 -1 -1 -1 -1 -1     1     1     1     1     1     1
...

I have two questions:

(1) I assume the coefficient for the 1st column (under (Intercept)) is
the overall mean, the coefficient for the 2nd column (under a1) is the
difference between the 1st level of factor a and the overall mean, and
the coefficient for the 4th column (under b1) is the difference between
the 1st level of factor b and the overall mean. Is this interpretation
correct?

(2) I'm not so sure about those interaction columns. For example, what
is a1:b1? Is it the 1st level of factor a at the 1st level of factor b
versus the overall mean, or something more complicated?
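
Both questions can be probed numerically. The following is only a
minimal sketch under stated assumptions: the response y is made up
with rnorm() purely for illustration (it is not part of the example
above), and the helper chk() is a hypothetical convenience function.
It fits the saturated model with sum-to-zero contrasts and compares
the coefficients with quantities computed directly from the cell
means. In this balanced layout the a1:b1 coefficient should match the
(a=1, b=1) cell mean minus the overall mean, minus the a1 effect,
minus the b1 effect, rather than the cell mean minus the overall mean
alone.

```r
# Same balanced layout as in the example above
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# Hypothetical response, invented purely for illustration
set.seed(1)
dd$y <- rnorm(12)

# Saturated fit with sum-to-zero contrasts on both factors
fit <- lm(y ~ a * b, data = dd,
          contrasts = list(a = "contr.sum", b = "contr.sum"))
cf <- coef(fit)

# Cell means as a 3 x 4 table (one observation per cell here)
cell  <- tapply(dd$y, list(a = dd$a, b = dd$b), mean)
grand <- mean(cell)               # overall mean of the cell means
a_eff <- rowMeans(cell) - grand   # level means of a minus the overall mean
b_eff <- colMeans(cell) - grand   # level means of b minus the overall mean

# Interaction effects: what is left of each cell mean after removing
# the overall mean and both main effects
ab_eff <- cell - grand - outer(a_eff, b_eff, "+")

# Hypothetical helper: compare two numbers, ignoring names
chk <- function(x, y) isTRUE(all.equal(unname(x), unname(y)))

# stopifnot() is silent when every comparison holds
stopifnot(
  chk(cf["(Intercept)"], grand),   # question (1): intercept
  chk(cf["a1"], a_eff[1]),         # question (1): a1
  chk(cf["b1"], b_eff[1]),         # question (1): b1
  chk(cf["a1:b1"], ab_eff[1, 1]),  # question (2): a1:b1
  chk(cf["a2:b3"], ab_eff[2, 3])   # question (2): another interaction
)
```

The comparison relies on the design being balanced, so that the
overall mean of y equals the mean of the cell means; with unequal
cell counts the two differ and the coefficients refer to the
unweighted mean of the cell means.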

Thanks in advance for your help,
Gang



