[R] Unexplained behavior of level names when using ordered factors in lm?

Bert Gunter gunter.berton at gene.com
Fri Dec 2 16:08:49 CET 2011

```
Maybe I should have said this explicitly:

> C(ordered(1:5))
[1] 1 2 3 4 5
attr(,"contrasts")
ordered
contr.poly
Levels: 1 < 2 < 3 < 4 < 5
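
# To spell that out, a rough sketch (not the only way to do it). `a` is
# rebuilt as in the original question; `levels = 1:5` is added here only so
# that all five levels are guaranteed to exist, and fit_poly / fit_trt are
# just names picked for this sketch.
cp <- contr.poly(5)
colnames(cp)                 # ".L" ".Q" ".C" "^4"
# These are the linear, quadratic, cubic and quartic trend contrasts. lm()
# pastes them onto the variable name, hence x.L, x.Q, x.C and x^4.

set.seed(4254)
a <- data.frame(y = rnorm(40),
                x = ordered(sample(1:5, 40, TRUE), levels = 1:5))
fit_poly <- lm(y ~ x, a)
names(coef(fit_poly))

# If you would rather see one coefficient per level (relative to level 1),
# ask for treatment contrasts instead. The fitted values should not change,
# only the parameterization does:
fit_trt <- lm(y ~ x, a, contrasts = list(x = "contr.treatment"))
names(coef(fit_trt))
all.equal(fitted(fit_poly), fitted(fit_trt))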

-- Bert

On Fri, Dec 2, 2011 at 7:06 AM, Bert Gunter <bgunter at gene.com> wrote:
> ?ordered
> ?C
> ?contr.poly
>
> If you don't know what polynomial contrasts are, consult any good
> linear models text. MASS (Venables and Ripley, "Modern Applied
> Statistics with S") has a good, though somewhat terse, section on
> this.
>
> -- Bert
>
> On Fri, Dec 2, 2011 at 6:51 AM, Tal Galili <tal.galili at gmail.com> wrote:
>> Hello dear all,
>>
>> I am unable to understand why when I run the following three lines:
>>
>> set.seed(4254)
>> a <- data.frame(y = rnorm(40), x = ordered(sample(1:5, 40, TRUE)))
>> summary(lm(y ~ x, a))
>>
>>
>> The output I get includes coefficient names that do not correspond to
>> the factor levels I am actually using:
>>
>> Call:
>> lm(formula = y ~ x, data = a)
>>
>> Residuals:
>>     Min      1Q  Median      3Q     Max
>> -1.4096 -0.6400 -0.1244  0.5886  2.1891
>>
>> Coefficients:
>>             Estimate Std. Error t value Pr(>|t|)
>> (Intercept) -0.03276    0.15169  -0.216    0.830
>> x.L         -0.28968    0.33866  -0.855    0.398
>> x.Q         -0.38813    0.33851  -1.147    0.259
>> x.C         -0.27183    0.34027  -0.799    0.430
>> x^4          0.25993    0.33935   0.766    0.449
>>
>> Residual standard error: 0.9564 on 35 degrees of freedom
>> Multiple R-squared: 0.08571, Adjusted R-squared: -0.01878
>> F-statistic: 0.8202 on 4 and 35 DF,  p-value: 0.5211
>>
>>
>> I am guessing that this has something to do with the contrast matrix
>> that is used, but the details are not clear to me.
>> Can anyone suggest a good read, or offer an explanation?
>>
>> Thanks.
>>
>>
>
>
>
