# [R] Using VGAM's vglm function for ordinal logistic regression

Inman, Brant A. M.D. Inman.Brant at mayo.edu
Sat Jan 6 23:56:43 CET 2007

```R-Experts:

I am using the vglm function of the VGAM library to perform proportional
odds ordinal logistic regression.  The issue that I would like help with
concerns the format in which the response variable must be provided for
this function to work correctly. Consider the following example:

------

library(VGAM)
library(MASS)

attach(pneumo)
pneumo	# Inspect the format of the original dataset

# Restructure the pneumo dataset into a different format
pneumo2 <- data.frame(matrix(ncol=3, nrow=24))
colnames(pneumo2) <- c('exposure.time', 'severity', 'freq')
pneumo2[,1] <- rep(pneumo[,1],3)
pneumo2[,2] <-
as.ordered(c(rep('normal',8),rep('mild',8),rep('severe',8)))
pneumo2[1:8,3]   <- pneumo[,2]
pneumo2[9:16,3]  <- pneumo[,3]
pneumo2[17:24,3] <- pneumo[,4]
pneumo2	# Inspect the format of the new modified dataset

------

The problem occurs when I try to analyze these two datasets, which are
identical in content, with the proportional odds model using vglm:

------

# Analyze the original dataset with vglm, get one set of results

> vglm(vglm(cbind(normal, mild, severe) ~ log(exposure.time),
data=pneumo,
+  family=cumulative(parallel=T))

Coefficients:
(Intercept):1      (Intercept):2 log(exposure.time)
9.676093          10.581725          -2.596807

Degrees of Freedom: 16 Total; 13 Residual
Residual Deviance: 5.026826
Log-likelihood: -204.2742

# Analyzing the modified dataset with vglm gives another set of results

> vglm(severity ~ log(exposure.time), weights=freq, data=pneumo2,
+ family=cumulative(parallel=T))

Coefficients:
(Intercept):1      (Intercept):2 log(exposure.time)
-1.6306499          2.5630170         -0.1937956

Degrees of Freedom: 48 Total; 45 Residual
Residual Deviance: 503.7251
Log-likelihood: -251.8626
Warning messages:
1: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control\$wzepsilon)
2: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control\$wzepsilon)
3: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control\$wzepsilon)
4: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control\$wzepsilon)
5: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control\$wzepsilon)

# Analyzing the modified dataset with polr reproduces these second
results

> polr(severity ~ log(exposure.time), weights=freq, data=pneumo2)

Coefficients:
log(exposure.time)
0.1938154

Intercepts:
mild|normal normal|severe
-1.630612      2.563055

Residual Deviance: 503.7251
AIC: 509.7251

------

Can someone explain what I am doing wrong when using vglm and polr with
the modified dataset?  I do not understand why these two formulations
should give different results.

Brant Inman

```