[R] model.matrix for a factor effect with no intercept

David Firth d.firth at warwick.ac.uk
Wed Feb 23 20:18:48 CET 2005


Brian, many thanks for these helpful pointers:

On 23 Feb 2005, at 15:45, Prof Brian Ripley wrote:

> MASS4 p.150
> White Book p.38
>
> Those are the only two reasonably comprehensive accounts that I am 
> aware of (and they have only partial overlap).
>
> The underlying motivation is to span the _additional_ vector space 
> covered by the term, the complement to what has gone before.  Put 
> another way, as each term is added, only enough columns are added to 
> the model matrix to span the same space as if dummy coding had been 
> used for that term and its predecessors.  So think of this as a way to 
> produce a parsimonious (usually full-rank) basis for the model space.

Yes, indeed.  My surprise was that *this* particular basis (dummy 
coding) was the one used here.  I should have got a clue from what 
contrasts() does, and is documented to do,

 > options("contrasts")
$contrasts
[1] "contr.treatment" "contr.poly"
 > contrasts(a)
                 .L         .Q
[1,] -7.071068e-01  0.4082483
[2,] -9.073800e-17 -0.8164966
[3,]  7.071068e-01  0.4082483

but when there's no intercept in the model the contrasts used appear to 
be

 > contrasts(a, contrasts = FALSE)
    -1 0 1
-1  1 0 0
0   0 1 0
1   0 0 1

which are not the same as

 > contr.poly(a, contrasts = FALSE)
      ^0            ^1         ^2
[1,]  1 -7.071068e-01  0.4082483
[2,]  1 -9.073800e-17 -0.8164966
[3,]  1  7.071068e-01  0.4082483

-- which is what I had naively expected to get in my model matrix. 

This is of course all a matter of convention.  The present convention 
does seem a touch confusing though: the basis for the space spanned by 
a factor is determined by options("contrasts") or by the contrasts 
attribute of the factor or by the contrasts argument in the call, 
*except* when there's no intercept or other factor earlier in the model 
in which case all such settings are ignored (well, not *quite* ignored: 
they do get put in the contrasts attribute of the resultant model 
matrix).

On the other hand, one good reason to use dummy coding for the 
first-encountered factor when there's no intercept is that the 
associated parameters are then often interpretable as group-specific 
intercepts.

Would it be an improvement, though, if the "contrasts" attribute of the 
resultant model matrix contained "contr.treatment" in such cases 
instead of the name of a contrast function that was not actually used?

Best wishes,
David


>
> On Wed, 23 Feb 2005, David Firth wrote:
>
>> I was surprised by this (in R 2.0.1):
>>
>>> a <- ordered(-1:1)
>>> a
>> [1] -1 0  1
>> Levels: -1 < 0 < 1
>>
>>> model.matrix(~ a)
>>  (Intercept)           a.L        a.Q
>> 1           1 -7.071068e-01  0.4082483
>> 2           1 -9.073800e-17 -0.8164966
>> 3           1  7.071068e-01  0.4082483
>> attr(,"assign")
>> [1] 0 1 1
>> attr(,"contrasts")
>> attr(,"contrasts")$a
>> [1] "contr.poly"
>>
>>> model.matrix(~ -1 + a)
>>  a-1 a0 a1
>> 1   1  0  0
>> 2   0  1  0
>> 3   0  0  1
>> attr(,"assign")
>> [1] 1 1 1
>> attr(,"contrasts")
>> attr(,"contrasts")$a
>> [1] "contr.poly"
>>
>> Without the intercept, treatment contrasts seem to have been used 
>> (this despite the "contr.poly" in the "contrasts" attribute).
>>
>> It's not restricted to ordered factors.  For example, if Helmert 
>> contrasts are used for nominal factors, the same sort of thing 
>> happens.
>>
>> I suppose it is a deliberate feature (perhaps to protect the user 
>> from accidentally fitting models that make no sense?  or maybe some 
>> better reason?) -- is it explained somewhere?
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595




More information about the R-help mailing list