[R] Probit and optim in R

Mon Sep 8 22:35:20 CEST 2003

I have had some weird results using the optim() function.  I wrote a
probit likelihood and wanted to run it with optim() with simulated
data.  I did not include a gradient at first and found that optim()
would not even iterate using BFGS and would only occasionally work
using SANN.  I programmed in the gradient and it iterates fine but the
estimates it returns are wrong.  The simulated data work fine when I
use them in the glm function in R.  And even stranger the function
seems to work fine with real data.  There must be some programming
quirk that I have overlooked.  My code is attached if anyone wants to
take a look.  I'd like to be able to make it work without the gradient
if possible.  Any help would be greatly appreciated as I am completely
stumped at this point.

Luke Keele
Dept of Political Science
UNC-Chapel Hill
-------------- next part --------------
## Probit Code For Simulation

#Define empty matrix 

na <- matrix(NA,10,3)

#Set counter and start loop

j <- 1

while (j < 11){

#Simulate Data

x1 <- rnorm(1000) 
x2 <- rnorm(1000)
latentz <- 1.0 + 2.0 * x1 - 3.0 * x2 + rnorm(1000)
y <- latentz
y[latentz < 1] <- 0
y[latentz >=1] <- 1

#Option export of data to check estimates in STATA
#SimProbit <- data.frame(y, x1, x2)
#write.table(SimProbit, 'a:/SimProbit')

x <- cbind(1, x1 ,x2)

#Define Likelihood
llik.probit<-function(par, X, Y){

Y <- as.matrix(y)
X <- as.matrix(x)
K <- ncol(X)
b <-as.matrix(par[1:K])

  phi<-pnorm(X%*%b, mean=0, sd=1, lower.tail = TRUE, log.p = FALSE)
  sum(Y*log(phi)+(1-Y)*log(1-phi))
}

grad <-function(par,X,Y){

Y <- as.matrix(y)
X <- as.matrix(x)
K <- ncol(X)
b <-as.matrix(par[1:K])
phi <- as.vector(pnorm(X%*%b))
     apply((phi*x), 2, sum)
}

#Fit Model
values.start <- lm(y ~ x1 + x2)$coef
result<-optim(values.start, llik.probit, gr=grad, Y=Y, X=X, method="BFGS",
                       control=list(maxit=2000, fnscale=-1, trace), hessian=T)

na[j,] <-result$par

j <- j+1
}

## Probit with Real Data

#Define Data

data<-read.table("c:/mydocu~1/harvar~1/snct.dat", header=T)
y <- data$MIL
x <- cbind(rep(1, length(y)), data$COOP, data$TARGET, data$COST, data$RELAT)

#Form Likelihood
llik.probit<-function(par, X, Y){

Y <- as.matrix(y)
X <- as.matrix(x)
K <- ncol(X)
b <- as.matrix(par[1:K])

  phi<-pnorm(X%*%b)
  sum(Y*log(phi)+(1-Y)*log(1-phi))
}

#Fit Model

values.start <- lm(MIL ~ COOP + TARGET + COST + RELAT, data=data)$coef
result<-optim(values.start, llik.probit, Y=y, X=x, method="BFGS",
                       control=list(fnscale=-1), hessian=T)

result$par

#This works too

data <- data.frame(y, x1, x2)
Y <-data$y
X <- cbind(rep(1,length(y)), data$x1, data$x2)

test <- glm(y ~ x1 + x2, family=binomial(probit), data=data)