[R] Models with ordered and unordered factors

Paul Johnson pauljohn32 at gmail.com
Tue Nov 15 17:54:28 CET 2011


On Tue, Nov 15, 2011 at 9:00 AM, Catarina Miranda
<catarina.miranda at gmail.com> wrote:
> Hello;
>
> I am having problems with the interpretation of models using ordered or
> unordered predictors.
> I am running models in lmer but I will try to give a simplified example
> data set using lm.
> Both in the example and in my real data set I use a predictor variable
> referring to 3 consecutive days of an experiment. It is a factor, and I
> thought it would be more correct to consider it ordered.
> Below is my example code with my comments/ideas along it.
> Can someone help me to understand what is happening?

Dear Catarina:

I have had the same question, and I hope my answers help you
understand what's going on.

The short version:

http://pj.freefaculty.org/R/WorkingExamples/orderedFactor-01.R

The longer version, "Working with Ordinal Predictors":

http://pj.freefaculty.org/ResearchPapers/MidWest09/Midwest09.pdf
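
As a minimal sketch of what the ordered-factor output is reporting (this
is not taken from the files linked above; the names DayOrd, fit_trt,
fit_poly, fit_ord_trt and by_hand are made up for illustration): with an
ordered factor, lm() uses polynomial contrasts (contr.poly) by default
instead of treatment contrasts, so Day.L and Day.Q are a linear and a
quadratic trend across the three days, not "Day 2" and "Day 3". Because
the example data are balanced, the two coefficients can be rebuilt by
hand from the three day means:

```r
# Example data from the quoted message
y <- c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81)
# factor() is called explicitly; since R 4.0.0 data.frame() no longer
# converts character vectors to factors on its own
Day <- factor(rep(c("Day 1", "Day 2", "Day 3"), each = 6))
DayOrd <- ordered(Day)

fit_trt  <- lm(y ~ Day)     # treatment contrasts: Day 2 vs Day 1, Day 3 vs Day 1
fit_poly <- lm(y ~ DayOrd)  # polynomial contrasts: linear (.L) and quadratic (.Q)

# The contrast matrix behind .L and .Q (columns are orthonormal)
print(contr.poly(3))

# With equal group sizes, each coefficient is the contrast column
# applied to the vector of day means
m <- as.numeric(tapply(y, Day, mean))
by_hand <- drop(t(contr.poly(3)) %*% m)
print(by_hand)
print(coef(fit_poly))

# Same model, different parameterization: fitted values are identical
print(all.equal(unname(fitted(fit_trt)), unname(fitted(fit_poly))))

# Keep the ordering but ask for the Day 1 comparisons explicitly
fit_ord_trt <- lm(y ~ DayOrd, contrasts = list(DayOrd = "contr.treatment"))
print(coef(fit_ord_trt))
```

Read this way, the significant Day.L says there is a roughly linear
increase from Day 1 to Day 3 (it equals the Day 3 minus Day 1 difference
divided by sqrt(2), hence the same t value as DayDay 3), and the
near-zero Day.Q says Day 2 sits almost exactly halfway between the other
two. The residuals, R-squared and F-statistic are unchanged because only
the coding of the predictor differs.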

HTH
pj

>
> Thanks a lot in advance;
>
> Catarina Miranda
>
>
> y<-c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81)
>
> Day<-c(rep("Day 1",6),rep("Day 2",6),rep("Day 3",6))
>
> dataf<-data.frame(y,Day)
>
> str(dataf) #Day is not ordered
> #'data.frame':   18 obs. of  2 variables:
> # $ y  : num  72 25 24 2 18 38 62 30 78 34 ...
> # $ Day: Factor w/ 3 levels "Day 1","Day 2",..: 1 1 1 1 1 1 2 2 2 2 ...
>
> summary(lm(y~Day,data=dataf))  #Day 2 is not significantly different from
> #Day 1, but Day 3 is.
> #
> #Call:
> #lm(formula = y ~ Day, data = dataf)
> #
> #Residuals:
> #    Min      1Q  Median      3Q     Max
> #-39.833 -14.458  -3.833  13.958  42.167
> #
> #Coefficients:
> #            Estimate Std. Error t value Pr(>|t|)
> #(Intercept)   29.833      9.755   3.058 0.00797 **
> #DayDay 2      18.833     13.796   1.365  0.19234
> #DayDay 3      37.000     13.796   2.682  0.01707 *
> #---
> #Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> #
> #Residual standard error: 23.9 on 15 degrees of freedom
> #Multiple R-squared: 0.3241,     Adjusted R-squared: 0.234
> #F-statistic: 3.597 on 2 and 15 DF,  p-value: 0.05297
> #
>
> dataf$Day<-ordered(dataf$Day)
>
> str(dataf) # "Day 1"<"Day 2"<"Day 3"
> #'data.frame':   18 obs. of  2 variables:
> # $ y  : num  72 25 24 2 18 38 62 30 78 34 ...
> # $ Day: Ord.factor w/ 3 levels "Day 1"<"Day 2"<..: 1 1 1 1 1 1 2 2 2 2 ...
>
> summary(lm(y~Day,data=dataf)) #Significances reversed (or are "Day.L" and
> #"Day.Q" not synonymous with "Day 2" and "Day 3"?): Day 2 (".L") is
> #significantly different from Day 1, but Day 3 (".Q") isn't.
>
> #Call:
> #lm(formula = y ~ Day, data = dataf)
> #
> #Residuals:
> #    Min      1Q  Median      3Q     Max
> #-39.833 -14.458  -3.833  13.958  42.167
> #
> #Coefficients:
> #            Estimate Std. Error t value Pr(>|t|)
> #(Intercept)  48.4444     5.6322   8.601 3.49e-07 ***
> #Day.L        26.1630     9.7553   2.682   0.0171 *
> #Day.Q        -0.2722     9.7553  -0.028   0.9781
> #---
> #Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> #
> #Residual standard error: 23.9 on 15 degrees of freedom
> #Multiple R-squared: 0.3241,     Adjusted R-squared: 0.234
> #F-statistic: 3.597 on 2 and 15 DF,  p-value: 0.05297
>

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
