[R] Constrain coefs. in linear model to sum to 0

Martin Maechler maechler at stat.math.ethz.ch
Tue Aug 8 10:23:48 CEST 2006

```>>>>> "BillV" ==   <Bill.Venables at csiro.au>
>>>>>     on Tue, 8 Aug 2006 15:52:20 +1000 writes:

>> Hello!
>>
>> I would like to use constrain to sum coeficients of a
>> factor to 0 instead of classical corner contraint i.e. I
>> would like to fit a model like
>>
>> lm(y ~ 1 + effectA + effectB)
>>
>> and say get parameters
>>
>> intercept
>> effectA_1
>> effectA_2
>> effectB_1
>> effectB_2
>> effectB_3
>>
>> where effectA_1 represents deviation of level A_1 from intercept and
>> sum(effectA_1, effectA_2) = 0 and the same for factor B.
>>
>> Is this possible to do?

BillV> Yes, but not quite as simply as you would like.  If you set

BillV> options(contrasts = c("contr.sum", "contr.poly"))

BillV> for example, then factor models are parametrised as
BillV> you wish above, BUT you don't get all the effects
BillV> directly

BillV> In your case above, for example, if fm is the fitted
BillV> model object, then

BillV> coef(fm)

BillV> Will give you intercept, effectA_2, effectB_2,
BillV> effectB_3.  The remaining effects*_1 you will need to
BillV> calculate as the negative of the sum of all the
BillV> others.

BillV> This gets a bit more complicated when you have
BillV> crossed terms, a*b, but the same principle applies.

Further, note that there have been functions

dummy.coef()
and
model.tables()

that have been intended to do exactly this.
Unfortunately, these two functions only work correctly in "simpler"
cases (e.g., complete balanced designs, IIRC)

E.g. help(model.tables) has

>> Warning:
>>
>>      The implementation is incomplete, and only the simpler cases have
>>      been tested thoroughly.

and  help(dummy.coef)

>> Details:
>>
>>   [........]
>>
>>      The method used has some limitations, and will give incomplete
>>      results for terms such as 'poly(x, 2))'.  However, it is adequate
>>      for its main purpose, 'aov' models.
>>
>>   [........]
>>
>> Warning:
>>
>>      This function is intended for human inspection of the output: it
>>      should not be used for calculations.  Use coded variables for all
>>      calculations.
>>
>>      The results differ from S for singular values, where S can be
>>      incorrect.

When we last discussed this, IIRC,
we (some within R-core)  were waiting for interested (and
knowledgable) users to provide improved code for these functions.
Hey, if you want to become immortal in the R annals, now is your
chance! ;-)

BillV> Bill Venables.

>>
>> Lep pozdrav / With regards,
>> Gregor Gorjanc
> [.........]

>> ----------------------------------------------------------------------
>> "One must learn by doing the thing; for though you think you know it,
>> you have no certainty until you try." Sophocles ~ 450 B.C.