[Rd] Minor glitch in optim()

Martin Maechler maechler at stat.math.ethz.ch
Fri Apr 20 09:55:44 CEST 2018

>>>>> J C Nash <profjcnash at gmail.com>
>>>>>     on Tue, 17 Apr 2018 10:18:02 -0400 writes:

    > Having worked with optim() and related programs for years, it surprised me
    > that I haven't noticed this before, but optim() is inconsistent in how it
    > deals with bounds constraints specified at infinity. Here's an example:

    # optim-glitch-Ex.R
    x0 <- c(1,2,3,4)
    fnt <- function(x, fscale=10){
      yy <- length(x):1
      val <- sum((yy*x)^2)*fscale
    grt <- function(x, fscale=10){
      nn <- length(x)
      yy <- nn:1
      #    gg <- rep(NA,nn)
      gg <- 2*(yy^2)*x*fscale

    npar <- 4
    lower <- -Inf
    l2 <- rep(-Inf,npar)

    a1 <- optim(x0, fnt, grt, lower=lower, method="BFGS") # works
    a1a<- optim(x0, fnt, grt, lower=l2, method="BFGS") # changes method!

> The first call uses BFGS method without warning. The second gives
> a warning that L-BFGS-B should be used, and from the output uses
> this.

> This is a bit of an edge case. My own preference would be for optim()
> to simply fail if bounds of any type are specified without L-BFGS-B
> as the method. I believe that gives clarity, even though infinite
> bounds imply an unconstrained problem.

> The behaviour where a scalar infinite bound is treated as unconstrained
> but a vector is not is inconsistent, however, and I think that at some
> point should be fixed.

I agree that it inconsistent.  The help page mentions that
"non-trivial" bounds are treated like that, and if you'd
consider  lower = c(-Inf, -Inf) to be  "non-trivial",  one could
say optim() behaved according to documentation and non-buggy,
but a much more reasonable definition of non-trivial bounds are
bounds that are not equivalent to (lower=-Inf, upper=+Inf)  and
the vector versions *are* equivalent.

So I agree about fixing.

> point should be fixed. Possibly the easiest way is to treat infinite
> bounds specified as a vector the same as those specified as a scalar.

I agree.  

> That is to adjust the code in File src/library/stats/R/optim.R
> in the block

    > if((length(lower) > 1L || length(upper) > 1L ||
    >    lower[1L] != -Inf || upper[1L] != Inf)
    >    && !any(method == c("L-BFGS-B","Brent"))) {
    > 	warning("bounds can only be used with method L-BFGS-B (or Brent)")
    > 	method <- "L-BFGS-B"
    > }

> Possibly

    > if((any(is.finite(lower) || any(is.finite(upper))
    >    && !any(method == c("L-BFGS-B","Brent"))) {
    > 	warning("bounds can only be used with method L-BFGS-B (or Brent)")
    > 	method <- "L-BFGS-B"
    > }

> Best, JN

I aim to go for the first line

    if((any(lower > -Inf)) || any(upper < Inf))

which is 100% correct and easier to "read", even though your
version -- after adding the missing ')' -- is faster
 and almost always equivalent.

Running checks now.  Wont' be in R 3.5.0 of course,  but then
probably in  R 3.5.0 patched.

Martin Maechler
ETH Zurich

More information about the R-devel mailing list