[Rd] on the optim function

Ravi Varadhan rvaradhan at jhmi.edu
Fri Aug 6 16:48:04 CEST 2010


Christophe,

This is a good question. It is conventional that most classical optimization
methods (e.g. gradient descent, quasi-Newton, and Newton-type schemes) do
not return the "iteration" counter, but they do return the number of times
the gradient is computed.  These two are equivalent in the sense that the
gradient is evaluated once and only once during each iteration.  However,
the optimization codes always return the number of times the objective
function is evaluated.  This is widely considered to be the most appropriate
metric for measuring the computational effort of the algorithm.  The reason
is this: the actual step length taken by the algorithm at each iteration
depends upon a "safe-guard" rule such as a line search or a trust-region
approach.
These safe-guard rules might evaluate the function multiple times to ensure
that the step length is "safe", for example, that it results in a reduction
of the function value.  Therefore, just monitoring the number of iterations
does not give you an accurate picture of the computational effort expended
by the algorithm.
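
To make this concrete, here is a minimal sketch of a backtracking line
search.  This is purely illustrative (it is not the code inside optim, and
the function name is mine), but it shows how a single iteration may call the
objective several times before a step is accepted:

## Illustrative backtracking line search (not optim's internal code).
## f: objective, x: current point, d: search direction, g: gradient at x.
backtrack <- function(f, x, d, g, alpha = 1, rho = 0.5, c1 = 1e-4) {
    f0 <- f(x)
    nevals <- 1
    repeat {
        nevals <- nevals + 1
        ## Armijo condition: accept the step only if it gives a
        ## sufficient decrease in the objective value.
        if (f(x + alpha * d) <= f0 + c1 * alpha * sum(g * d)) break
        alpha <- rho * alpha      # otherwise shrink the step and try again
    }
    list(alpha = alpha, fn_evals = nevals)
}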

Another point worth noting is that when an analytic gradient function is not
specified by the user, most (if not all) gradient-based algorithms use a
numerical approximation for the gradient.  This is typically done using a
first-order forward-difference scheme.  This involves `p' additional
evaluations of the objective function for each gradient evaluation, when you
have a p-dimensional parameter vector to be optimized.  This greatly
increases the computational effort, much more than that involved in
safe-guarding.  Therefore, it is always a good idea to specify an analytic
gradient whenever possible.
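
As a rough sketch of why the cost is `p' extra evaluations (again, this is
an illustration, not the code R actually uses internally), a
forward-difference gradient looks like this:

## Forward-difference gradient approximation (illustrative sketch only).
## Needs one evaluation for f(x) plus one extra evaluation per parameter,
## i.e. p additional objective calls for a p-dimensional x.
num_grad <- function(f, x, eps = 1e-7) {
    f0 <- f(x)
    sapply(seq_along(x), function(i) {
        xi <- x
        xi[i] <- xi[i] + eps
        (f(xi) - f0) / eps
    })
}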

Here is an example (based on that from optim's help page) that illustrates
the equivalence of the iteration count and the gradient count:
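
For completeness, fr and grr below are the Rosenbrock "banana" function and
its analytic gradient, essentially as defined on optim's help page
(reproduced here so the example is self-contained):

fr <- function(x) {    # Rosenbrock banana function
    x1 <- x[1]
    x2 <- x[2]
    100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) {   # analytic gradient of fr
    x1 <- x[1]
    x2 <- x[2]
    c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
       200 * (x2 - x1 * x1))
}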

> optim(c(-1.2,1), fr, grr, method="CG", control=list(maxit=50))
$par
[1] -0.9083598  0.8345932

$value
[1] 3.636803

$counts
function gradient 
     202       51 

$convergence
[1] 1

$message
NULL


> optim(c(-1.2,1), fr, method="CG", control=list(maxit=75))$counts
function gradient 
     304       76 
>
Note that when I specified `maxit = 50', I got a gradient count of 51, and
when I specified `maxit = 75', I got a gradient count of 76.

But look at the difference in function counts: there are 102 more function
evaluations, even though I did only 25 more iterations.  This tells you that
the function is evaluated more than once during an iteration.

The above is also true for BFGS and L-BFGS-B.  You can check it for yourself.
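
So, to answer your last question: since the gradient is evaluated once per
iteration (plus one initial evaluation, as the CG runs above suggest), the
gradient count itself essentially gives you the iteration number.  A small
sketch, using the fr and grr defined above:

## Reading the iteration count off the gradient count.
res <- optim(c(-1.2, 1), fr, grr, method = "BFGS")
res$counts                        # named vector: function and gradient counts
res$counts[["gradient"]] - 1      # approximately the number of iterations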


Hope this helps,
Ravi.


-----Original Message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
On Behalf Of Christophe Dutang
Sent: Friday, August 06, 2010 5:15 AM
To: r-devel at r-project.org
Subject: [Rd] on the optim function

Dear useRs,

I have just discovered that the R optim function does not return the number
of iterations. 

I still wonder why, at lines 632-634 of the optim C code, the iter variable
is not returned (for the BFGS method, for example)?

Is there any trick to compute the iteration number from the function call
number?

Kind regards

Christophe

--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr
