[R] Logistic Regression using optim() give "L-BFGS-B" error, please help

ashky ashokkrish at gmail.com
Tue Sep 30 01:57:13 CEST 2008


Sorry, I deleted my old post. Pasting the new query below.

Dear R Users/Experts, 

I am using a function called logitreg() originally described in MASS (the
book 4th Ed.) by Venebles & Ripley, p445. I used the code as provided but
made couple of changes to run a 'constrained' logistic regression, I set the
method = "L-BFGS-B", set lower/upper values for the variables. 

Here is the function, 

logitregVR <- function(x, y, wt = rep(1, length(y)), intercept = T, start =
rep(0, p), ...)
{
#-------------------------------------------#
# A function to be minimized (or maximized) #
#-------------------------------------------#
fmin <- function(beta, X, y, w) 
{
  p <- plogis(X %*% beta)
  #p <- ifelse(1-p < 1e-6, 1-1e-6, p) 
 cat("----------", "\n")
 cat(paste("X = "), X, "\n")
 cat(paste("beta = "), beta, "\n")
 cat(paste("p = "), p, "\n")
 cat(paste("1-p = "), 1-p, "\n")
 cat(paste("y = "), y, "\n")
 cat(paste("length(p) ="), length(p), "\n")
 cat(paste("class(p) ="), class(p), "\n")
 cat("----------", "\n")

 -sum(2*w*ifelse(y, log(p), log(1-p))) # Residual Deviance: D =
-2[Likelihood Ratio]
 print(-sum(2*w*ifelse(y, log(p), log(1-p))))
}

#----------------------------------------------------------------------#
# A function to return the gradient for the "BFGS", "L-BFGS-B" methods #
#----------------------------------------------------------------------#
gmin <- function(beta, X, y, w) 
{
  eta <- X %*% beta; p <- plogis(eta)
  #p <- ifelse(1-p < 1e-6, 1-1e-6, p) 
  -2*matrix(w*dlogis(eta)*ifelse(y, 1/p, -1/(1-p)), 1) %*% X # Gradient
  print(-2*matrix(w*dlogis(eta)*ifelse(y, 1/p, -1/(1-p)), 1) %*% X)
}

  if(is.null(dim(x))) dim(x) <- c(length(x), 1)
  dn <- dimnames(x)[[2]]
  if(!length(dn)) dn <- paste("Slope", 1:ncol(x), sep="")
  p <- ncol(x) + intercept # 1 + 1
  if(intercept) {x <- cbind(1, x); dn <- c("(Intercept)", dn)}
  if(is.factor(y)) y <- (unclass(y) != 1)

#fit <- optim(start, fmin, gmin, X = x, y = y, w = wt, method = "L-BFGS-B",
lower = c(-Inf, 0.05), upper = c(Inf, Inf), ...)
fit <- optim(start, fmin, gmin, X = x, y = y, w = wt, control = list(trace =
6), method = "L-BFGS-B", lower = c(-Inf, 0.05), upper = c(Inf, Inf), ...)

  names(fit$par) <- dn
  cat("\nCoefficients:\n"); print(fit$par)
  cat("\nResidual Deviance:", format(fit$value), "\n")
  if(fit$convergence > 0) cat("\nConvergence code:", fit$convergence, "\n")
  return(list(fitpar = fit$par))
  invisible(fit)
}

And here is my data, 

# ---------------------- 
# Dose 0 1 emp.prop 
# ---------------------- 
# 1 3 0 0.000 
# 2 1 0 0.000 
# 3.3 4 0 0.000 
# 5.016 3 0 0.000 
# 7.0224 4 0 0.000 
# 9.3398 2 0 0.000 
# 12.4219 2 0 0.000 
# 16.5211 47 1 0.021 
# 21.9731 1 1 0.500 
# ---------------------- 

I have used the glm(family = binomial) and the lrm() function. But my
interest is to constraint the slope parameter (beta) to some specified value
(say, a non negative number, 0.05 for instance). If you notice in the above
code by Venebles and Ripley, for the optim() part I chose method =
"L-BFGS-B" and set lower and upper values for my two parameters. Here is the
code and the error I am getting, 

y <- c(rep(0,3), rep(0,1), rep(0,4), rep(0,3), rep(0,4), rep(0,2), rep(0,2),
rep(0,47), rep(1,1), rep(0,1), rep(1,1)) 
x <- c(rep(1,3), rep(2,1), rep(3.3,4), rep(5.016,3), rep(7.0224,4),
rep(9.3398,2), rep(12.4219,2), rep(16.5211,48), rep(21.9731,2)) 

length(x) 
length(y) 

#------------------# 
# Method 1 - Works # 
#------------------# 
glm(y~x, family = binomial)$coefficients 

# summary(glm(y~x, family = binomial)) 

#------------------# 
# Method 2 - Works # 
#------------------# 
  library(Design) 
  lrm(y ~ x)$coef 

#-------------------# 
# Method 3 - Works # 
#-------------------# 
lgtreg <- function(beta, x, y) 
{ 
       eta <- exp(beta[1] + beta[2]*x) 
       p <- eta/(1+eta) 
       return(-sum(ifelse(y,log(p),log(1-p)))) # This is the log-likelihood 
} 

optim(c(0,0), lgtreg, x = x, y = y, method = "BFGS", hessian = TRUE)$par 

#-----------------------------------------------------------------# 
# Method 4 - Veneble and Ripley's Method with different start values # 
#-----------------------------------------------------------------# 

logitregVR(x, y) # No error 
logitregVR(x, y, start = rep(0.1, 2)) # Gives error 
logitregVR(x, y, start = rep(0.5, 2)) # No error 
logitregVR(x, y, start = rep(1, 2)) # Gives error 

I am getting this error, 

Error in optim(start, fmin, gmin, X = x, y = y, w = wt, method = "L-BFGS-B",
: L-BFGS-B needs finite values of 'fn'

If you see, at each iteration I printed the X, beta, p, Deviance etc. But
for certain start values this function is failing on me and giving an
infinite value for Deviance. Can someone please help me and tell what I am
doing wrong.

Any comments/replies highly appreciated, 

Thank you very much,
Ashky
-- 
View this message in context: http://www.nabble.com/Logistic-Regression-using-optim%28%29-give-%22L-BFGS-B%22-error%2C-please-help-tp19733974p19733974.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list