[R] Effect of multiplying the parscale vector by a scalar

Fri May 12 08:32:43 CEST 2006

I noticed that the result of running optim() changes when the vector
passed as the parscale argument is multiplied by a scalar. For
example, the only difference between the two code fragments below is
that in the second one parscale is ten times larger than in the first.
Nevertheless, the optimal value and the optimal solution are quite
different: the second run finds a much better solution (-91.94 versus
-4.901; the function is being minimized).
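
For anyone who wants to experiment without my model_lik (which is not
included in this post), the comparison can be set up roughly as
follows. This is only a sketch: toy_lik is a made-up smooth stand-in
objective, not my likelihood, and run_with_scale is a hypothetical
helper name. On a function this simple I would expect the two runs to
agree closely, which is exactly what does not happen with my model.

```r
# Hypothetical stand-in objective (NOT the model_lik from this post):
# a smooth quadratic with its minimum at `target`.
target  <- c(0.005, 0.9)
toy_lik <- function(phi) sum((phi - target)^2)

# Hypothetical helper: run L-BFGS-B with parscale multiplied by `k`.
run_with_scale <- function(k, parscale = c(0.005, 1)) {
  optim(c(0.002, 0.5), toy_lik, NULL, method = "L-BFGS-B",
        lower = c(0, 0), upper = c(Inf, 1),
        control = list(maxit = 1000, parscale = k * parscale))
}

res1  <- run_with_scale(1)
res10 <- run_with_scale(10)

# Compare optimal values and solutions side by side.
rbind(k1  = c(value = res1$value,  res1$par),
      k10 = c(value = res10$value, res10$par))
```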

Of course, multiplication by a very small scalar could lead to
problems related to rounding errors. But I do not believe that is what
is happening here. In the first example the rescaled optimal variables
(par/parscale) are all around 1, within one order of magnitude.
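
The rescaling can be checked directly from the numbers in this post. A
small sketch, assuming only what the optim() documentation states,
namely that the optimization is performed on par/parscale; the variable
names (par1, scaled1, scaled10) are mine:

```r
# Optimal parameters and parscale copied from the first example in this post.
par1     <- c(0.0065395, 0.0002001, 0.0475511, 0.9046640, 0.9050617,
              -0.3166011, 0.3050438, 0.2868558, -0.9515073)
parscale <- c(0.005, 0.0001, 0.05, 1, 1, 0.1, 0.1, 0.3, 1)

# optim() optimizes over par/parscale, so these are the values it works with.
scaled1  <- par1 / parscale
scaled10 <- par1 / (10 * parscale)   # the same point under the second scaling

round(scaled1, 3)
round(scaled10, 3)

# Check that the magnitudes stay within one order of magnitude of 1.
all(abs(scaled1) > 0.1 & abs(scaled1) < 10)
```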

I have also noticed that if one moves just one of the fourth or fifth
parameters from its optimal value in the first example (0.9046640 and
0.9050617, respectively) towards 1, the function increases. But if one
moves both of them simultaneously and linearly towards (1, 1), the
function decreases, even for very small steps. In the first example
optim() does not seem to be able to discover that direction of
descent.
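
The kind of check I did can be sketched as follows. Since model_lik is
not included here, toy_fn below is a made-up nonsmooth function that
merely reproduces the same pattern, and probe is a hypothetical helper
name. I do not know whether my likelihood has a kink like this one, so
please treat it as an illustration only.

```r
# Hypothetical nonsmooth stand-in (NOT model_lik): the kink along
# phi[4] == phi[5] hides the descent direction from single-coordinate moves.
toy_fn <- function(phi) abs(phi[4] - phi[5]) - 0.1 * (phi[4] + phi[5])

# Hypothetical helper: evaluate fn at x + t * d for each step size t.
probe <- function(fn, x, d, steps) {
  sapply(steps, function(t) fn(x + t * d))
}

x0    <- c(0, 0, 0, 0.9, 0.9, 0, 0, 0, 0)
steps <- c(0, 1e-6, 1e-4, 1e-2)
e4    <- replace(numeric(9), 4, 1)        # move parameter 4 only
e5    <- replace(numeric(9), 5, 1)        # move parameter 5 only
both  <- replace(numeric(9), c(4, 5), 1)  # move parameters 4 and 5 together

only4 <- probe(toy_fn, x0, e4, steps)
only5 <- probe(toy_fn, x0, e5, steps)
joint <- probe(toy_fn, x0, both, steps)

# One row per step size: single-coordinate moves versus the joint move.
cbind(steps, only4, only5, joint)
```

With the real objective one would pass model_lik and the optimum from
the first example in place of toy_fn and x0.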

Any insights will be greatly appreciated.

Thanks.

FS

######################## First example ##########################
> phi.init <- c(0.002, 0.0002, 0.05, 0.9, 0.9, -0.3, 0.3, 0.3, -1)
> lo <- c(0,   0, 0, 0, 0, -Inf, 0, 0, -Inf)
> hi <- c(Inf, Inf, Inf, 1, 1, 0, Inf, Inf, 0)
> phi_ <- phi.init
>
> opt.time <- system.time(
+     phi_opt <- optim(phi_, model_lik, NULL, method = "L-BFGS-B",
+                      lower = lo, upper = hi,
+                      control = list(maxit = 1000,
+                                     parscale = c(0.005, 0.0001, 0.05, 1, 1, 0.1, 0.1, 0.3, 1),
+                                     trace = 0, REPORT = 3),
+                      hessian = FALSE))[3]
>
> phi_opt
$par
[1]  0.0065395  0.0002001  0.0475511  0.9046640  0.9050617 -0.3166011  0.3050438  0.2868558 -0.9515073

$value
[1] -4.901

$counts
function gradient
      21       21

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

######################## Second example ##########################

>
> phi.init <- c(0.002, 0.0002, 0.05, 0.9, 0.9, -0.3, 0.3, 0.3, -1)
> lo <- c(0,   0, 0, 0, 0, -Inf, 0, 0, -Inf)
> hi <- c(Inf, Inf, Inf, 1, 1, 0, Inf, Inf, 0)
> phi_ <- phi.init
>
> opt.time <- system.time(
+     phi_opt <- optim(phi_, model_lik, NULL, method = "L-BFGS-B",
+                      lower = lo, upper = hi,
+                      control = list(maxit = 1000,
+                                     parscale = 10 * c(0.005, 0.0001, 0.05, 1, 1, 0.1, 0.1, 0.3, 1),
+                                     trace = 0, REPORT = 3),
+                      hessian = FALSE))[3]
>
> phi_opt
$par
[1]  0.001167  0.000200  0.000000  1.000000  1.000000 -0.296163  0.295126  0.280790  0.000000

$value
[1] -91.94

$counts
function gradient
      48       48

$convergence
[1] 0

$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"

>