[R] Understanding linear contrasts in Anova using R

Peter Dalgaard pdalgd at gmail.com
Thu Sep 30 22:05:54 CEST 2010


On 09/30/2010 08:31 PM, Max Kuhn wrote:
> These two resources might also help:
> 
>    http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
>    http://cran.r-project.org/web/packages/contrast/vignettes/contrast.pdf
> 
> Max

They're a tad long, though. Let me try to say it more briefly:

Contrast calculation and contrast parametrization are each other's
(g-)inverses.


E.g.

> m
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    1    0    0
[3,]    1    1    0
[4,]    1    1    1
> solve(cbind(1,m))
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]   -1    1    0    0
[3,]    0   -1    1    0
[4,]    0    0   -1    1

So you can parametrize with successive differences using m, but it is the
other matrix (the inverse) that is used to compute the successive
differences from the group means.
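
To see it in action, here is a little sketch with simulated data (f, y,
and the seed are made up purely for illustration):

m <- cbind(c(0,1,1,1), c(0,0,1,1), c(0,0,0,1))   # the matrix printed above
set.seed(1)
f <- gl(4, 10)                                   # a balanced 4-level factor
y <- rnorm(40) + as.numeric(f)                   # response with a trend
contrasts(f) <- m                                # parametrize the factor with m
coef(lm(y ~ f))                                  # fitted coefficients...
drop(solve(cbind(1, m)) %*% tapply(y, f, mean))  # ...equal inverse %*% means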

(There's quite a lot of confusion caused by the obsession with
ORTHOGONAL contrasts in earlier days. For those kinds of contrasts
(only!), the inverse equals the transpose up to scaling factors. The
polynomial contrasts are even designed to be orthonormal, so that the
transpose equals the g-inverse.)
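
E.g., a quick check one can run:

C <- contr.poly(5)
zapsmall(crossprod(C))  # the identity: the columns are orthonormal,
                        # so t(C) serves as the g-inverse of C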


> 
> 
> On Thu, Sep 30, 2010 at 1:33 PM, Ista Zahn <izahn at psych.rochester.edu> wrote:
>> Hi Professor Howell,
>> I think the issue here is simply the assumption that the regression
>> coefficients will always equal the sum of the products of the means and
>> the contrast codes. I tend to think of a regression coefficient as the
>> covariance of x and y divided by the variance of x, and that definition
>> agrees with the coefficients calculated by lm(). See below for a
>> long-winded example.
>>
>> On Wed, Sep 29, 2010 at 3:42 PM, David Howell <David.Howell at uvm.edu> wrote:
>>>  # I am trying to understand how R fits models for contrasts in a
>>>  # simple one-way anova. This is an example; I am not stupid enough to want
>>>  # to simultaneously apply all of these contrasts to real data. With a few
>>>  # exceptions, the tests that I would compute by hand (or by other software)
>>>  # will give the same t or F statistics. It is the contrast estimates that
>>>  # R produces that I can't seem to understand.
>>> #
>>> # In searching for answers to this problem, I found a great PowerPoint
>>> # slide (I think by John Fox). The slide pointed to the coefficients, said
>>> # something like "these are coeff. that no one could love," and then
>>> # suggested looking at the means to understand where they came from. I
>>> # have stared and stared at his means and then my means, but can't find a
>>> # relationship.
>>>
>>> # The following code and output illustrates the problem.
>>>
>>> # Various examples of Anova using R
>>>
>>> dv <- c( 1.28,  1.35,  3.31,  3.06,  2.59,  3.25,  2.98,  1.53, -2.68,  2.64,  1.26,  1.06,
>>>         -1.18,  0.15,  1.36,  2.61,  0.66,  1.32,  0.73, -1.06,  0.24,  0.27,  0.72,  2.28,
>>>         -0.41, -1.25, -1.33, -0.47, -0.60, -1.72, -1.74, -0.77, -0.41, -1.20, -0.31, -0.74,
>>>         -0.45,  0.54, -0.98,  1.68,  2.25, -0.19, -0.90,  0.78,  0.05,  2.69,  0.15,  0.91,
>>>          2.01,  0.40,  2.34, -1.80,  5.00,  2.27,  6.47,  2.94,  0.47,  3.22,  0.01, -0.66)
>>>
>>> group <- factor(rep(1:5, each = 12))
>>>
>>>
>>> # Use treatment contrasts to compare each group to the first group.
>>> options(contrasts = c("contr.treatment","contr.poly"))  # The default
>>> model2 <- lm(dv ~ group)
>>> summary(model2)
>>>  # Summary table is the same as for model1 (an earlier fit, not shown)--as it should be
>>>  # Intercept is Group 1 mean and other coeff. are deviations from that.
>>>  # This is what I would expect.
>>>  #summary(model1)
>>>  #              Df Sum Sq Mean Sq F value    Pr(>F)
>>>  #  group        4  62.46 15.6151  6.9005 0.0001415 ***
>>>  #  Residuals   55 124.46  2.2629
>>>  #Coefficients:
>>>  #            Estimate Std. Error t value Pr(>|t|)
>>>  #(Intercept)  1.80250    0.43425   4.151 0.000116 ***
>>>  #group2      -1.12750    0.61412  -1.836 0.071772 .
>>>  #group3      -2.71500    0.61412  -4.421 4.67e-05 ***
>>>  #group4      -1.25833    0.61412  -2.049 0.045245 *
>>>  #group5       0.08667    0.61412   0.141 0.888288
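>>>
>>> # A quick check (M is a new helper name here, holding the group means):
>>> M <- tapply(dv, group, mean)   #  1.8025  0.6750 -0.9125  0.5442  1.8892
>>> c(M[1], M[-1] - M[1])          # reproduces the coefficients above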
>>>
>>>
>>> # Use sum contrasts to compare each group against grand mean.
>>> options(contrasts = c("contr.sum","contr.poly"))
>>> model3 <- lm(dv ~ group)
>>> summary(model3)
>>>
>>>  # Again, this is as expected. Intercept is grand mean and others are
>>>  # deviations from that.
>>>  #Coefficients:
>>>  #              Estimate Std. Error t value Pr(>|t|)
>>>  #  (Intercept)   0.7997     0.1942   4.118 0.000130 ***
>>>  #  group1        1.0028     0.3884   2.582 0.012519 *
>>>  #  group2       -0.1247     0.3884  -0.321 0.749449
>>>  #  group3       -1.7122     0.3884  -4.408 4.88e-05 ***
>>>  #  group4       -0.2555     0.3884  -0.658 0.513399
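>>>
>>> # Likewise for sum contrasts, using M from above: the intercept is the
>>> # mean of the group means, the rest are the deviations of groups 1-4.
>>> c(mean(M), (M - mean(M))[1:4])  # reproduces the coefficients above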
>>>
>>> #SO FAR, SO GOOD
>>>
>>> # IF I wanted polynomial contrasts BY HAND I would use
>>> #    a(i) =  -2  -1  0  1  2   for the linear contrast
>>> #    (or some linear function of this)
>>> #    Effect = Sum(a(i)*M(i))    # where M(i) = group i mean
>>> #    Effect(linear) = -2(1.805) -1(0.675) +0(-.912) +1(.544) +2(1.889) = 0.043
>>> #    SS(linear) = n*Effect(linear)^2/Sum(a(i)^2) = 12*(.043^2)/10 = .002
>>> #    F(linear) = SS(linear)/MS(error) = .002/2.263 = .001
>>> #    t(linear) = sqrt(.001) = .031
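>>>
>>> # The same arithmetic in R (a.lin etc. are new helper names; M is from above):
>>> a.lin <- c(-2, -1, 0, 1, 2)
>>> effect.lin <- sum(a.lin * M)                # 0.043
>>> SS.lin <- 12 * effect.lin^2 / sum(a.lin^2)  # = .002
>>> sqrt(SS.lin / 2.2629)                       # t = .031 (2.2629 = MS(error))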
>>>
>>> # To do this in R I would use
>>> order.group <- ordered(group)
>>> model4 <- lm(dv~order.group)
>>> summary(model4)
>>> #  This gives:
>>> #Coefficients:
>>> #                  Estimate Std. Error t value Pr(>|t|)
>>> #    (Intercept)    0.79967    0.19420   4.118 0.000130 ***
>>> #    order.group.L  0.01344    0.43425   0.031 0.975422
>>> #    order.group.Q  2.13519    0.43425   4.917 8.32e-06 ***
>>> #    order.group.C  0.11015    0.43425   0.254 0.800703
>>> #    order.group^4 -0.79602    0.43425  -1.833 0.072202 .
>>>
>>> # The t value for linear is the same as I got (as are the others), but I
>>> # don't understand the estimates. The intercept is the grand mean, but I
>>> # don't see the relationship of the other estimates to that, or to the
>>> # ones I get by hand.
>>> # My estimates are the sum of (coeff times means), i.e. 0 (intercept),
>>> # .0425, 7.989, .3483, -6.66, and these are not a linear (or other nice
>>> # pretty) function of the estimates from R.
>>
>> # OK, let's break it down
>> Means <- tapply(dv, order.group, mean)
>> coefs.by.hand.m4 <- c(mean(Means),
>>                   sum(Means*contrasts(order.group)[,1]),
>>                   sum(Means*contrasts(order.group)[,2]),
>>                   sum(Means*contrasts(order.group)[,3]),
>>                   sum(Means*contrasts(order.group)[,4]))
>> (coefs.check.m4 <- rbind(coef(model4), coefs.by.hand.m4))
>> ## OK, so these coefficients are in fact the sum of the products of the
>> ## contrast codes and the group means. The only difference between this
>> ## and what you got by hand should be in the actual contrast codes used.
>> ## You don't say what they are, but I'll assume you did the standard
>> ## thing, like this:
>> contrasts.by.hand <- matrix(c(-2, -1,  0,  1,  2,    # linear
>>                                2, -1, -2, -1,  2,    # quadratic
>>                               -1,  2,  0, -2,  1,    # cubic
>>                                1, -4,  6, -4,  1),   # quartic
>>                             byrow=FALSE, ncol=4,
>>                             dimnames=list(1:5, c("L", "Q", "C", "x4")))
>> contrasts(order.group) <- contrasts.by.hand
>>
>> model5 <- lm(dv~order.group)
>> summary(model5)
>>
>> coefs.by.hand.m5 <- c(mean(Means),
>>                   sum(Means*contrasts(order.group)[,1]),
>>                   sum(Means*contrasts(order.group)[,2]),
>>                   sum(Means*contrasts(order.group)[,3]),
>>                   sum(Means*contrasts(order.group)[,4]))
>> (coefs.check.m5 <- rbind(coef(model5), coefs.by.hand.m5))
>>
>> ## Not the same, and not a linear function of one another
>>
>>
>> ## OK, let's see what's going on here. We can define the regression
>> ## coefficient b_i as cov(x_i, y)/var(x_i). First check model 4:
>> mm4 <- as.data.frame(cbind(dv, model.matrix(model4)[,-1]))
>>
>> cov.L.m4 <- cov(mm4[,"dv"], mm4[,"order.group.L"])
>> cov.Q.m4 <- cov(mm4[,"dv"], mm4[,"order.group.Q"])
>> cov.C.m4 <- cov(mm4[,"dv"], mm4[,"order.group.C"])
>> cov.x4.m4 <- cov(mm4[,"dv"], mm4[,"order.group^4"])
>>
>> var.L.m4 <- var(mm4[,"order.group.L"])
>> var.Q.m4 <- var(mm4[,"order.group.Q"])
>> var.C.m4 <- var(mm4[,"order.group.C"])
>> var.x4.m4 <- var(mm4[,"order.group^4"])
>>
>> covs.m4 <- c(cov.L.m4, cov.Q.m4, cov.C.m4, cov.x4.m4)
>> vars.m4 <- c(var.L.m4, var.Q.m4, var.C.m4, var.x4.m4)
>> coefs.by.hand.m4.2 <- c(mean(dv), covs.m4/vars.m4)
>>
>> (coefs.check.m4.2 <- rbind(coef(model4), coefs.by.hand.m4.2))
>> ## OK those are equal. Now try for model 5
>>
>> mm5 <- as.data.frame(cbind(dv, model.matrix(model5)[,-1]))
>> names(mm5) <- names(mm4)
>>
>> cov.L.m5 <- cov(mm5[,"dv"], mm5[,"order.group.L"])
>> cov.Q.m5 <- cov(mm5[,"dv"], mm5[,"order.group.Q"])
>> cov.C.m5 <- cov(mm5[,"dv"], mm5[,"order.group.C"])
>> cov.x4.m5 <- cov(mm5[,"dv"], mm5[,"order.group^4"])
>>
>> var.L.m5 <- var(mm5[,"order.group.L"])
>> var.Q.m5 <- var(mm5[,"order.group.Q"])
>> var.C.m5 <- var(mm5[,"order.group.C"])
>> var.x4.m5 <- var(mm5[,"order.group^4"])
>>
>> covs.m5 <- c(cov.L.m5, cov.Q.m5, cov.C.m5, cov.x4.m5)
>> vars.m5 <- c(var.L.m5, var.Q.m5, var.C.m5, var.x4.m5)
>> coefs.by.hand.m5.2 <- c(mean(dv), covs.m5/vars.m5)
>>
>> (coefs.check.m5.2 <- rbind(coef(model5), coefs.by.hand.m5.2))
>>
>> ## Those are equal. OK, so we see that the coefficients are consistently
>> ## equal to cov(x,y)/var(x), but only equal to sum(contrasts(x)*Means)
>> ## under certain circumstances. What are those circumstances? Let's try
>> ## scaling our contrasts so they all have the same variance:
>>
>> contrasts.by.hand.scaled <- scale(contrasts.by.hand)
>> contrasts(order.group) <- contrasts.by.hand.scaled
>>
>> model7 <- lm(dv~order.group)
>> summary(model7)
>>
>> coefs.by.hand.m7 <- c(mean(Means),
>>                   sum(Means*contrasts(order.group)[,1]),
>>                   sum(Means*contrasts(order.group)[,2]),
>>                   sum(Means*contrasts(order.group)[,3]),
>>                   sum(Means*contrasts(order.group)[,4]))
>> (coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
>>
>> ## Not the same, but they are a linear function of one another: dividing
>> ## by 4 recovers the lm() coefficients (more on the 4 below):
>> coefs.by.hand.m7 <- c(mean(Means),
>>                   sum(Means*contrasts(order.group)[,1])/4,
>>                   sum(Means*contrasts(order.group)[,2])/4,
>>                   sum(Means*contrasts(order.group)[,3])/4,
>>                   sum(Means*contrasts(order.group)[,4])/4)
>> (coefs.check.m7 <- rbind(coef(model7), coefs.by.hand.m7))
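>>
>> ## In general, for a balanced design with centered, orthogonal contrast
>> ## columns, each coefficient is sum(contrast * Means)/sum(contrast^2).
>> ## After scale() every column has mean 0 and variance 1 over the 5
>> ## levels, so sum(contrast^2) = 5 - 1 = 4, which is where the /4 above
>> ## comes from. A compact check:
>> colSums(Means * contrasts(order.group)) / colSums(contrasts(order.group)^2)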
>>
>> ## Hopefully this clarifies what the coefficients calculated by the
>> ## lm() function represent, i.e., cov(x,y)/var(x).
>>
>>>
>>> # Just as a check, I forced the intercept through zero with deviation
>>> # scores or -1 in the model.
>>> #  Now force intercept to 0 by using deviation scores
>>>
>>> devdv <- dv-mean(dv)
>>> model5 <- lm(devdv~order.group)
>>> summary(model5)
>>> #Same as above except intercept = 0
>>>
>>> # Now do it by removing the intercept
>>> model6 <- lm(dv~order.group -1)
>>> summary(model6)
>>> #  Weird because coeff = cell means
>>>    #             Estimate Std. Error t value Pr(>|t|)
>>> #    order.group1   1.8025     0.4342   4.151 0.000116 ***
>>> #    order.group2   0.6750     0.4342   1.554 0.125824
>>> #    order.group3  -0.9125     0.4342  -2.101 0.040208 *
>>> #    order.group4   0.5442     0.4342   1.253 0.215464
>>> #    order.group5   1.8892     0.4342   4.350 5.94e-05 ***
>>>
>>> # BUT note that the row labels would suggest that we were not solving
>>> # for polynomials, but the contrast matrix is made up of polynomial
>>> # coefficients.
>>
>> ## I think you're correct that by removing the intercept lm() is no
>> ## longer using the contrast coding. I've never really understood
>> ## regression models without intercept terms, I'm afraid...
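>>
>> ## I believe what happens is that with the intercept removed the (first)
>> ## factor gets full indicator coding -- one 0/1 column per level -- so
>> ## each coefficient is a cell mean, whatever contrasts(order.group) says:
>> head(model.matrix(~ order.group - 1))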
>>
>>>
>>> # contrasts(order.group)
>>> #                .L         .Q            .C         ^4
>>> #[1,] -6.324555e-01  0.5345225 -3.162278e-01  0.1195229
>>> #[2,] -3.162278e-01 -0.2672612  6.324555e-01 -0.4780914
>>> #[3,] -3.287978e-17 -0.5345225  1.595204e-16  0.7171372
>>> #[4,]  3.162278e-01 -0.2672612 -6.324555e-01 -0.4780914
>>> #[5,]  6.324555e-01  0.5345225  3.162278e-01  0.1195229
>>>
>>> # So, when I use polynomials (either through contr.poly or by supplying
>>> # a matrix of coefficients), what do the estimates represent?
>>
>> I hope my examples have clarified this. They represent the increase in
>> y for a one-unit increase in x. What that one-unit increase represents
>> is of course a function of the contrast codes used.
>>
>>>
>>> # My problem is not restricted to polynomials. If I try a set of
>>> # orthogonal linear contrasts on the group means, I get
>>> # contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>>> #                           c(-0,0,0,1,-1), c(1,1,4,-1,1))
>> # These are not orthogonal (assign them first, then check):
>> contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>>                           c(-0,0,0,1,-1), c(1,1,4,-1,1))
>> cor(contrasts(group))
>> ## Fixing the last column (c(1,1,-4,1,1) in place of c(1,1,4,-1,1)) gives us
>> contrasts(group) <- cbind(c(1,1,0,-1,-1), c(1,-1,0,0,0),
>>                           c(-0,0,0,1,-1), c(1,1,-4,1,1))
>> model3 <- lm(dv ~ group)
>> summary(model3)
>>
>> ## These coefficients are functions of the specified contrasts:
>> coefs.by.hand.m3 <-  c(mean(Means),
>>                       (mean(Means[1:2]) - mean(Means[4:5]))/2,
>>                       (Means[1] - Means[2])/2,
>>                       (Means[4] - Means[5])/2,
>>                       (mean(Means[c(1,2,4,5)]) - Means[3])/5)
>> ## Note that in each case the coefficient is the contrast applied to the
>> ## means, divided by the sum of the squared contrast coefficients
>> (coef.check.m3 <- rbind(coef(model3), coefs.by.hand.m3))
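>>
>> ## Or all four at once, using the same general rule as before
>> ## (coefficient = sum(contrast * Means)/sum(contrast^2)):
>> colSums(Means * contrasts(group)) / colSums(contrasts(group)^2)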
>>
>> Hope it helps,
>> Ista
>>
>>> #model3 <- lm(dv ~ group)
>>> #summary(model3)
>>> #Coefficients:
>>> #            Estimate Std. Error t value Pr(>|t|)
>>> #(Intercept)   1.5335     0.2558   5.995 1.64e-07 ***
>>> #group1        0.3168     0.2279   1.390 0.170185
>>> #group2        0.5638     0.3071   1.836 0.071772 .
>>> #group3       -1.2840     0.3369  -3.811 0.000352 ***
>>> #group4       -0.6115     0.1387  -4.408 4.88e-05 ***
>>> # These are not even close to what I would expect. By hand I would
>>> # compute the contrasts as .0442, 1.1275, 1.3450, and 8.5608, with
>>> # different t values.
>>>
>>> # Any help would be appreciated.
>>>
>>> Thanks,
>>> Dave Howell
>>>
>>
>>
>>
>> --
>> Ista Zahn
>> Graduate student
>> University of Rochester
>> Department of Clinical and Social Psychology
>> http://yourpsyche.org
>>
>>
> 
> 
> 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


