[R] Interpreting effect of ordered categorical predictor

Marc Girondot marc_grt at yahoo.fr
Wed Apr 2 20:24:42 CEST 2014


(I posted this question at 
http://stackoverflow.com/questions/22781965/interpreting-effect-of-ordered-categorical-predictor 
but got no answer, so I am trying here.)

Thanks a lot
Marc

My question is very similar to this one 
(https://stat.ethz.ch/pipermail/r-help/2012-March/305357.html), but I 
do not fully understand the answer given there. I would like to visualize 
the effect of an ordered categorical predictor after fitting a glm.

First I generate some dummy data:

## data.frame with a continuous measure and a factor with 6 levels
datagenerate <- data.frame(
  measure=c(rnorm(20, 10, 2), rnorm(30, 15, 2), rnorm(20, 20, 2),
            rnorm(20, 25, 2), rnorm(20, 30, 2), rnorm(20, 35, 2)),
  factor=c(rep("A", 20), rep("B", 30), rep("C", 20),
           rep("D", 20), rep("E", 20), rep("F", 20)),
  stringsAsFactors=FALSE)
## factor is still a character vector here, so levels() would return NULL;
## count the distinct values instead
nbfactors <- length(unique(datagenerate$factor))

Now I fit a glm with the predictor as an unordered factor:

## First, the factor is unordered
datagenerate$factor <- as.factor(datagenerate$factor)
essaiglm <- glm(measure ~ factor, datagenerate, family=gaussian())
coef_unordered <- coef(summary(essaiglm))[,1]
plot(1:nbfactors, c(0, coef_unordered[2:nbfactors]), type="h", bty="n",
     las=1, xlab="Factors", ylab="Effect")

This works. But I would like to do the same with an ordered factor:

## Now the factor is ordered
datagenerate$factor <- ordered(datagenerate$factor,
                               levels=c("A", "B", "C", "D", "E", "F"))
essaiglm <- glm(measure ~ factor, datagenerate, family=gaussian())
coef_ordered <- coef(summary(essaiglm))[,1]

## I am not sure about this line. How are the levels of an ordered factor coded?
x <- ((0:(nbfactors-1))-(nbfactors-1)/2)/(nbfactors-1)

y <- x*coef_ordered["factor.L"]+x^2*coef_ordered["factor.Q"]+
x^3*coef_ordered["factor.C"]+x^4*coef_ordered["factor^4"]+
x^5*coef_ordered["factor^5"]
y <- y-min(y)
plot(1:nbfactors, y, type="h", bty="n", las=1, xlab="Factors",
     ylab="Effect")

The result depends strongly on how the levels are coded. After several 
attempts, I propose

x <- ((0:(nbfactors-1))-(nbfactors-1)/2)/(nbfactors-1)

but I am not sure this is right.
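
One way to check the coding is to look at the contrast matrix itself rather than guess x. The following is only a sketch. It assumes R's default contrast for ordered factors (contr.poly) is in effect, and the names d, fit, cm, eff and eff_rel, as well as the fixed seed, are introduced here for illustration.

```r
## Sketch: rebuild per-level effects from the polynomial coefficients.
set.seed(1)  # assumption: fixed seed so the example is reproducible
lev <- c("A", "B", "C", "D", "E", "F")
n <- c(20, 30, 20, 20, 20, 20)
d <- data.frame(measure = rnorm(sum(n), rep(seq(10, 35, by = 5), n), 2),
                factor = ordered(rep(lev, n), levels = lev))
fit <- glm(measure ~ factor, data = d, family = gaussian())

## Contrast matrix actually used by the fit:
## one row per level, one column per term (.L, .Q, .C, ^4, ^5)
cm <- contrasts(d$factor)

## Contribution of the factor terms for each level (intercept excluded)
eff <- drop(cm %*% coef(fit)[-1])

## Shift so that level A is the reference, as with treatment contrasts
eff_rel <- eff - eff[1]

plot(seq_along(lev), eff_rel, type = "h", bty = "n", las = 1,
     xlab = "Factors", ylab = "Effect")
```

The columns of contr.poly are orthogonal polynomials evaluated at the levels, not raw powers of an equally spaced x, so summing x^k times each coefficient would not be expected to reproduce these values. Under the assumptions above, eff_rel should match the differences between each level's mean and the mean of level A.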

If someone has the answer, I will be very grateful.

Thanks a lot

Marc



