[R] Help with categorical predictors in regression models

Pamela Foggia pamela.foggia at gmail.com
Fri Jun 19 23:32:59 CEST 2015


Hello,
In my regression models (linear and logistic) I have two predictor
variables, both categorical: DEGREE and REGION.

DEGREE is the educational level, an ordinal variable with five levels
(0-LT HIGH SCHOOL, 1-HIGH SCHOOL, 2-JUNIOR COLLEGE, 3-BACHELOR,
4-GRADUATE).

REGION is the region of the respondent, a nominal variable with 9 levels
(1-NEW ENGLAND, 2-MIDDLE ATLANTIC, 3-E. NOR. CENTRAL, 4-W. NOR. CENTRAL,
5-SOUTH ATLANTIC, 6-E. SOU. CENTRAL, 7-W. SOU. CENTRAL, 8-MOUNTAIN,
9-PACIFIC).
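
As a minimal sketch of this coding (the vectors degree_code and
region_code are made-up illustrative values, not the actual data, and the
label vectors are simply copied from the descriptions above), the two
variables could be declared in R roughly as follows:

```r
# Hypothetical integer codes, as they might appear in the raw data
degree_code <- c(0, 3, 1, 4, 2, 1)
region_code <- c(9, 1, 5, 5, 3, 8)

degree_labels <- c("LT HIGH SCHOOL", "HIGH SCHOOL", "JUNIOR COLLEGE",
                   "BACHELOR", "GRADUATE")
region_labels <- c("NEW ENGLAND", "MIDDLE ATLANTIC", "E. NOR. CENTRAL",
                   "W. NOR. CENTRAL", "SOUTH ATLANTIC", "E. SOU. CENTRAL",
                   "W. SOU. CENTRAL", "MOUNTAIN", "PACIFIC")

# Ordinal variable: the order of the levels carries meaning
DEGREE <- factor(degree_code, levels = 0:4, labels = degree_labels,
                 ordered = TRUE)

# Nominal variable: the order of the levels is arbitrary
REGION <- factor(region_code, levels = 1:9, labels = region_labels)

is.ordered(DEGREE)
is.ordered(REGION)
```

Declaring the levels explicitly keeps all five (or nine) categories even
when some of them do not occur in a particular subset of the data.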

In many examples I have read that, in order to use these predictors
correctly as categorical variables, I first have to wrap them in the
factor() function, for example:

fit1 <- lm(Z ~ factor(X) + factor(Y))
fit2 <- glm(W ~ factor(X) + factor(Y), family=binomial(link="logit"))

obtaining the following output for the logistic regression

                 coef.est coef.se
(Intercept)         1.027   0.263
factor(DEGREE)1     0.301   0.134
factor(DEGREE)2     0.340   0.211
factor(DEGREE)3     0.748   0.168
factor(DEGREE)4     1.267   0.237
...

Here Z is a continuous variable and W is a binary variable. My question
is: since X is an ordinal variable, would it be more correct to use the
ordered() function rather than factor()? I mean something like this:

fit1 <- lm(Z ~ ordered(X) + factor(Y))
fit2 <- glm(W ~ ordered(X) + factor(Y), family=binomial(link="logit"))

which gives a different output, like this:

                   coef.est coef.se
(Intercept)           1.558   0.241
ordered(DEGREE).L     0.942   0.157
ordered(DEGREE).Q     0.215   0.160
ordered(DEGREE).C     0.118   0.111
ordered(DEGREE)^4    -0.106   0.143
...

What do the suffixes .L, .Q, .C and ^4 in the ordered() output mean?
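
As a hedged pointer rather than a full answer: these suffixes match the
column names of contr.poly(5), the polynomial contrast matrix that R uses
by default for ordered factors (linear, quadratic, cubic and fourth-degree
terms). The sketch below uses simulated data, not the data from this post;
the variable names X and W only mirror the formulas above, and it assumes
that options("contrasts") has been left at its default.

```r
# Column names of the polynomial contrasts for a 5-level ordered factor
poly_names <- colnames(contr.poly(5))
print(poly_names)

# Simulated stand-ins for the ordinal predictor and the binary response
set.seed(1)
n <- 200
X <- sample(0:4, n, replace = TRUE)
W <- rbinom(n, 1, plogis(-0.5 + 0.3 * X))

# Same model, two codings of the same predictor
fit_factor  <- glm(W ~ factor(X),  family = binomial(link = "logit"))
fit_ordered <- glm(W ~ ordered(X), family = binomial(link = "logit"))

# Coefficient names carry the contrast labels
print(names(coef(fit_ordered)))

# Both codings describe the same five group means on the logit scale,
# so the fitted probabilities should agree up to numerical tolerance
same_fit <- isTRUE(all.equal(fitted(fit_factor), fitted(fit_ordered),
                             tolerance = 1e-6))
print(same_fit)
```

In this sketch only the parameterisation of the coefficients changes
between the two fits; the fitted values themselves are not affected by the
choice between factor() and ordered().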

Thanks in advance



