[R] optim() for ordered logit model with parallel regression assumption

Wed Aug 1 04:07:48 CEST 2012

On Tue, Jul 31, 2012 at 7:57 PM, Xu Jun <junxu.r at gmail.com> wrote:
> Dear R listers,
>
> I am learning the MLE utility optim() in R to program ordered logit
> models just as an exercise. See below I have three independent
> variables, x1, x2, and x3. Y is coded as ordinal from 1 to 4. Y is not
> yet a factor variable here. The ordered logit model satisfies the
> parallel regression assumption. The following codes can run through,
> but results were totally different from what I got using the polr
> function from the MASS package. I think it might be due to the way how
> the p is constructed in the ologit.lf function. I am relatively new to
> R, and here I would guess probably something related to the class type
> (numeric vs. matrix) or something along that line among those if
> conditions. Thanks in advance for any suggestion.
>
> Jun Xu, PhD
> Assistant Professor
> Department of Sociology
> Ball State University
> Muncie, IN 47306
>
>
> ####################################################################
>
> library(foreign)
> readin <- read.dta("ordfile.dta", convert.factors=FALSE)
> myvars <- c("depvar", "x1", "x2", "x3")
> mydta <- readin[myvars]
> # remove all missings
> mydta <- na.omit(mydta)
>
> # theta is the parameter vector
> ologit.lf <- function(theta, y, X) {
>   n <- nrow(X)
>   k <- ncol(X)
> # b is the coefficient vector for independent variables
>   b <- theta[1:k]
> # tau1 is cut-point 1
>   tau1 <- theta [k+1]
> # tau2 is cut-point 2
>   tau2 <- theta [k+2]
> # tau3 is cut-point 1
>   tau3 <- theta [k+3]
>
>   if (y == 1){
>     p <- (1/(1+exp( - tau1 + X %*% b)))
>   }
>   if (y == 2) {
>     p <- (1/(1+exp( - tau2 + X %*% b))) - (1/(1+exp( - tau1 + X %*% b)))
>   }
>   if (y == 3) {
>     p <- (1/(1+exp( - tau3 + X %*% b))) - (1/(1+exp( - tau2 + X %*% b)))
>   }
>   if (y == 4) {
>     p <- 1 - (1/(1+exp( - tau3 + X %*% b)))
>   }
>   - sum(p)
> }
>
> y <- as.numeric(mydta$depvar)
> X <- cbind (mydta$x1, mydta$x2, mydta$x3)
> runopt <- optim(rep(0, ncol(X)+4), ologit.lf, method="BFGS",
> hessian=T, y=y, X=X)
>
>
> There were 50 or more warnings (use warnings() to see the first 50)
>> warnings()
>
> 1: In if (y == 1) { ... :
>   the condition has length > 1 and only the first element will be used
> 2: In if (y == 2) { ... :
>   the condition has length > 1 and only the first element will be used

It looks like you've got a fundamental problem in your if/else
statements. if and else are control structures and so they operate on
the whole program flow -- I think you want the ifelse() function here
instead.

Take a look at this example:

x <- c(1, 5, 9)

if(x < 3) {y <- x^2} else {y <- 2}

z <- ifelse(x < 3, x^2, 2)

print(x)
print(y)
print(z)

See also ?ifelse

Best,
Michael

>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.