[R] Using VGAM's vglm function for ordinal logistic regression
Inman, Brant A. M.D.
Inman.Brant at mayo.edu
Sun Jan 7 17:27:34 CET 2007
Thank you for the help. Indeed, the differences in the results that I
noted were due to the incorrect ordering of the response variable that
resulted from my unattentive use of as.ordered on a character vector.
Brant
-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
Sent: Sunday, January 07, 2007 2:52 AM
To: Inman, Brant A. M.D.
Cc: r-help at stat.math.ethz.ch; yee at stat.auckland.ac.nz
Subject: Re: [R] Using VGAM's vglm function for ordinal logistic
regression
On Sat, 6 Jan 2007, Inman, Brant A. M.D. wrote:
>
> R-Experts:
>
> I am using the vglm function of the VGAM library to perform
proportional
> odds ordinal logistic regression. The issue that I would like help
with
> concerns the format in which the response variable must be provided
for
> this function to work correctly.
> pneumo2$severity
[1] normal normal normal normal normal normal normal normal mild
mild
[11] mild mild mild mild mild mild severe severe severe
severe
[21] severe severe severe severe
Levels: mild < normal < severe
is different from the ordering in the first example.
The difference between vglm and polr is that the latter uses the
conventional sign for the coefficients: see MASS4 p.204.
I would never use as.ordered on a character vector, as this leaves it to
R
to choose the ordering. (Even if you think you intended alphabetical
order, that depends on the locale: see the warnings on the help page.)
> Consider the following example:
>
> ------
>
> library(VGAM)
> library(MASS)
>
> attach(pneumo)
> pneumo # Inspect the format of the original dataset
>
> # Restructure the pneumo dataset into a different format
> pneumo2 <- data.frame(matrix(ncol=3, nrow=24))
> colnames(pneumo2) <- c('exposure.time', 'severity', 'freq')
> pneumo2[,1] <- rep(pneumo[,1],3)
> pneumo2[,2] <-
> as.ordered(c(rep('normal',8),rep('mild',8),rep('severe',8)))
> pneumo2[1:8,3] <- pneumo[,2]
> pneumo2[9:16,3] <- pneumo[,3]
> pneumo2[17:24,3] <- pneumo[,4]
> pneumo2 # Inspect the format of the new modified dataset
>
> ------
>
> The problem occurs when I try to analyze these two datasets, which are
> identical in content, with the proportional odds model using vglm:
>
> ------
>
> # Analyze the original dataset with vglm, get one set of results
>
>> vglm(vglm(cbind(normal, mild, severe) ~ log(exposure.time),
> data=pneumo,
> + family=cumulative(parallel=T))
>
> Coefficients:
> (Intercept):1 (Intercept):2 log(exposure.time)
> 9.676093 10.581725 -2.596807
>
> Degrees of Freedom: 16 Total; 13 Residual
> Residual Deviance: 5.026826
> Log-likelihood: -204.2742
>
> # Analyzing the modified dataset with vglm gives another set of
results
>
>> vglm(severity ~ log(exposure.time), weights=freq, data=pneumo2,
> + family=cumulative(parallel=T))
>
> Coefficients:
> (Intercept):1 (Intercept):2 log(exposure.time)
> -1.6306499 2.5630170 -0.1937956
>
> Degrees of Freedom: 48 Total; 45 Residual
> Residual Deviance: 503.7251
> Log-likelihood: -251.8626
> Warning messages:
> 1: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
> trace, wzeps = control$wzepsilon)
> 2: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
> trace, wzeps = control$wzepsilon)
> 3: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
> trace, wzeps = control$wzepsilon)
> 4: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
> trace, wzeps = control$wzepsilon)
> 5: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
> trace, wzeps = control$wzepsilon)
>
> # Analyzing the modified dataset with polr reproduces these second
> results
>
>> polr(severity ~ log(exposure.time), weights=freq, data=pneumo2)
>
> Coefficients:
> log(exposure.time)
> 0.1938154
>
> Intercepts:
> mild|normal normal|severe
> -1.630612 2.563055
>
> Residual Deviance: 503.7251
> AIC: 509.7251
>
> ------
>
> Can someone explain what I am doing wrong when using vglm and polr
with
> the modified dataset? I do not understand why these two formulations
> should give different results.
>
> Brant Inman
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list