# [R] Ordinal categorical data with GLM

Andrew Criswell arc at arcriswell.com
Thu Apr 11 18:41:27 CEST 2002

```Hello All:

I am trying to replicate the results of an example found in Alan
Agresti's "Categorical Data Analysis" on pages 267-269. The example is
one of a 2 x 2 cross-classification table of ordinal counts: job
satisfaction and income.

I am able to get Agresti's results for the independence model (G^2 =
12.03 with df = 9) assuming as he does that the data is nominal, but I'm
unable to derive his model of uniform association (linear-by-linear
association, p. 263-269) for which he gets a value of G^2 = 2.39 with df
= 8.

The observed data is represented by table 8.2 on page 268 as follows:

> Freq <- c(20, 24,  80,  82,
+           22, 38, 104, 125,
+           13, 28,  81, 113,
+            7, 18,  54,  92)
>
> data.3 <- t(matrix(Freq, nrow = 4))
>
> list.3 <- list(Income       = c("< 6,000",
+                                 "6,000-15,000",
+                                 "15,000-25,000",
+                                 "> 25,000"),
+                Satisfaction = c("Very dissatisfied",
+                                 "Little dissatisfied",
+                                 "Moderately satisfied",
+                                 "Very satisfied"))
>
> dimnames(data.3) <- list.3
>
> ftable(data.3)
Satisfaction Very dissatisfied Little dissatisfied
Moderately satisfied Very satisfied
Income

< 6,000                                   20
24                   80             82
6,000-15,000                              22
38                  104            125
15,000-25,000                             13
28                   81            113
> 25,000                                   7
18                   54             92
>

I am able to obtain Agresti's results for the independence model which
assumes the data is nominal, not ordinal, using either glm() or loglm().

> library(MASS)
> options(contrasts=c("contr.sum", "contr.poly"))
>
> X <- as.integer(gl(4, 4, 16)) - 1
> Y <- as.integer(gl(4, 1, 16)) - 1
>
> data.2 <- data.frame(Freq, X = factor(X), Y = factor(Y))
>
> summary(fm3 <- glm(Freq ~ X + Y, data = data.2,
+                    family = poisson()))

Call:
glm(formula = Freq ~ X + Y, family = poisson(), data = data.2)

Deviance Residuals:
Min        1Q    Median        3Q       Max
-1.50416  -0.67501  -0.08592   0.53800   1.51852

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.74425    0.04444  84.259  < 2e-16 ***
X1          -0.07101    0.05982  -1.187    0.235
X2           0.26754    0.05368   4.984 6.22e-07 ***
X3           0.06070    0.05726   1.060    0.289
Y1          -1.02174    0.09995 -10.222  < 2e-16 ***
Y2          -0.46674    0.08101  -5.761 8.35e-09 ***
Y3           0.61632    0.05917  10.416  < 2e-16 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 445.763  on 15  degrees of freedom
Residual deviance:  12.037  on  9  degrees of freedom
AIC: 115.07

Number of Fisher Scoring iterations: 3

> dummy.coef(fm3)
Full coefficients are

(Intercept):       3.744253
X:                        0           1           2           3
-0.07101181  0.26753870  0.06069753 -0.25722441
Y:                        0           1           2           3
-1.0217353  -0.4667389   0.6163210   0.8721532
>
> fm4 <- loglm(Freq ~ X + Y, data = data.2, param = T, fit = T)
> fm4;  fm4\$param
Call:
loglm(formula = Freq ~ X + Y, data = data.2, param = T, fit = T)

Statistics:
X^2 df  P(> X^2)
Likelihood Ratio 12.03686  9 0.2112391
Pearson          11.98857  9 0.2139542
\$"(Intercept)"
 3.744253

\$X
0           1           2           3
-0.07101181  0.26753871  0.06069753 -0.25722443

\$Y
0          1          2          3
-1.0217356 -0.4667388  0.6163211  0.8721533

>

My question is this:  can glm() or some other function be used in the
manner Agresti employed for ordinal count data?

Thank you,
ANDREW

Andrew Criswell
Professor of Finance