# [R] analysis of covariance and constrained parameters

Steven Orzack orzack at freshpond.org
Fri Feb 16 22:14:51 CET 2018

```
Consider an analysis of covariance involving age and cohort. The goal is
to assess whether the influence of cohort depends upon the age. The
simplest case involves data as follows

value    Age     Cohort
x1       1       3
x2       1       4
x3       1       5
x4       2       3
x5       2       4
x6       2       5
etc.

Age is a factor. The numeric response variable is value and Cohort is a
numeric predictor. So, (pseudo-code) commands to estimate the
age-specific relationship between value and Cohort could be

glm(value ~ Age/Cohort -  1, family =......, data = .....)

glm(value ~ Age/(Cohort + I(Cohort^2)) - 1, family =......, data = .....).

The latter commands would provide estimates of the age-specific
intercept, linear, and quadratic coefficients, as in

value_Age1 <- intercept_Age1 + linear_Age1*Cohort + quad_Age1*Cohort^2

value_Age2 <- intercept_Age2 + linear_Age2*Cohort + quad_Age2*Cohort^2

This is standard. One would choose among the above models via analysis
of variance or AIC.

Now assume that I have external knowledge that tells me that there is NO
influence of Cohort on value for Age1 and that there could be up to a
quadratic influence for Age2. Accordingly, I would like to fit a model
which estimates these relationships:

value_Age1 <- intercept_Age1 (+ 0*Cohort + 0*Cohort^2)
(which is, of course, value_Age1 <- intercept_Age1)

value_Age2 <- intercept_Age2 + linear_Age2*Cohort + quad_Age2*Cohort^2

What is the glm syntax to fit this model? It is a model in which the
(two) Cohort coefficients for one level of the factor are constrained to
a particular value (0), while there is no such constraint for the second
level of the factor.

Please note that I understand that

glm(value ~ Age/(Cohort + I(Cohort^2)) - 1, family =......, data = .....).

generates point estimates of the linear and quadratic coefficients for
Age1 (as above) and one could inspect them to determine whether they are
statistically indistinguishable from 0.

However, I want to incorporate the knowledge that these coefficients
MUST BE 0 into my hypothesis testing. Knowing that these coefficients
are 0 could influence the results of anova and AIC comparisons, since it
reduces the number of estimated parameters (and so changes the degrees
of freedom) associated with the model.

Many thanks for suggestions in advance!

--
Steven Orzack
Fresh Pond Research Institute
173 Harvey Street
Cambridge, MA 02140
617 864-4307

www.freshpond.org

```
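One possible way to express the constraint described in the post is sketched below. It is not a confirmed answer from the thread. The idea is to multiply the Cohort terms by a 0/1 indicator for the second age level, so that they contribute nothing for Age1. The simulated data, the column name `isAge2`, and the choice of `family = gaussian` are all assumptions made for illustration; the original post leaves the family and data unspecified.

```r
# Illustrative sketch only: the data are simulated and every name and
# number here is an assumption, not taken from the original post.
set.seed(1)
dat <- expand.grid(Cohort = 1:10, Age = factor(1:2), rep = 1:5)

# Age 1 is flat in Cohort; Age 2 follows a quadratic in Cohort.
mu <- ifelse(dat$Age == "1",
             2,
             1 + 0.5 * dat$Cohort - 0.03 * dat$Cohort^2)
dat$value <- mu + rnorm(nrow(dat), sd = 0.5)

# Hypothetical helper column: 1 for the second age level, 0 otherwise.
dat$isAge2 <- as.numeric(dat$Age == "2")

# Unconstrained model from the post: age-specific intercept, linear and
# quadratic coefficients (6 parameters). Gaussian family is assumed.
full <- glm(value ~ Age/(Cohort + I(Cohort^2)) - 1,
            family = gaussian, data = dat)

# Constrained model: the Cohort terms are multiplied by the indicator,
# so they are identically zero for Age1 (4 parameters).
constrained <- glm(value ~ Age + I(isAge2 * Cohort) + I(isAge2 * Cohort^2) - 1,
                   family = gaussian, data = dat)

# Compare the nested models; the constrained fit uses two fewer parameters.
print(anova(constrained, full, test = "F"))
print(AIC(constrained, full))
```

Because the indicator columns are zero for every Age1 observation, the constrained fit estimates only an intercept for Age1 while leaving the Age2 curve free, and the two zero coefficients are never counted as estimated parameters in `anova` or `AIC`.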