[R] Subject: Re ZINB by Newton Raphson??

Tue Jun 22 22:18:12 CEST 2010

I have not included the previous postings because they came out very strangely on my mail 
reader. However, the question concerned the choice of minimizer for the zeroinfl() 
function, which apparently allows any of the current 6 methods of optim() for this 
purpose. The original poster wanted to use Newton-Raphson.

Newton-Raphson (or just Newton for simplicity) is commonly thought to be the "best" way to 
approach optimization problems. I've had several people ask me why the optimx() package 
(see OptimizeR project on r-forge -- probably soon on CRAN, we're just tidying up) does 
not have such a choice. Since the question comes up fairly frequently, here is a response. 
I caution that it is based on my experience and others may get different mileage.

My reasons for being cautious about Newton are as follows:
1) Newton's method needs a number of safeguards to avoid singular or indefinite Hessian 
issues. These can be tricky to implement well and to do so in a way that does not hinder 
the progress of the optimization.
2) One needs both gradient and Hessian information, and it needs to be accurate. Numerical 
approximations are slow and often inadequate for tough problems.
3) Far from a solution, Newton is often not very good, likely because the function is not 
well approximated by a quadratic model over the whole space.
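
To make point 1 concrete, here is a minimal sketch in base R. The test function 
f(x) = x^4/4 - x^2/2 and the names newton_1d, grad and hess are chosen purely for 
illustration; they are not from zeroinfl() or optimx. The "safeguard" shown (replace the 
second derivative by its absolute value, bounded below by 0.1) is only one crude 
possibility and has no line search, so treat it as a sketch rather than a recommended 
implementation.

```r
# Hypothetical 1-D test function (not from the thread): minima at x = -1 and
# x = 1, a local maximum at x = 0, and a negative second derivative whenever
# |x| < 1/sqrt(3).
grad <- function(x) x^3 - x          # first derivative of x^4/4 - x^2/2
hess <- function(x) 3 * x^2 - 1      # second derivative

newton_1d <- function(x0, safeguard = FALSE, iters = 20) {
  x <- x0
  for (i in seq_len(iters)) {
    h <- hess(x)
    # Crude safeguard (an assumption, not a tuned rule): use |h|, bounded
    # away from zero, so the step always points downhill.
    if (safeguard) h <- max(abs(h), 0.1)
    x <- x - grad(x) / h
  }
  x
}

plain <- newton_1d(0.3)                    # heads for the maximum at 0
safe  <- newton_1d(0.3, safeguard = TRUE)  # ends near the minimum at 1
```

Started where the second derivative is negative, the plain iteration is attracted to the 
stationary point at 0, which is a maximum. The safeguarded version reaches a minimum from 
the same start, but even this toy shows how much depends on the details of the safeguard.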

Newton does VERY well at converging when it has a "close enough" start. If you can find an 
operationally useful way to generate such starts, you deserve an award like the Fields Medal.
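
A small illustration of how sharp the "close enough" boundary can be, again in base R 
and again with a function picked only for the purpose: f(x) = sqrt(1 + x^2), which is 
smooth and convex with its minimum at 0. For this f the unsafeguarded Newton update 
works out to x_new = -x^3, so the iteration converges very fast from |x0| < 1 and 
diverges from |x0| > 1. The names pure_newton, f_grad and f_hess are hypothetical.

```r
# Derivatives of the illustrative function f(x) = sqrt(1 + x^2).
f_grad <- function(x) x / sqrt(1 + x^2)
f_hess <- function(x) (1 + x^2)^(-1.5)

# Pure Newton iteration: no line search, no safeguards.
pure_newton <- function(x0, iters = 5) {
  x <- x0
  for (i in seq_len(iters)) {
    x <- x - f_grad(x) / f_hess(x)
  }
  x
}

near <- pure_newton(0.5)  # start inside |x| < 1: collapses onto the minimum at 0
far  <- pure_newton(1.5)  # start outside |x| < 1: iterates grow without bound
```

Note that the Hessian here is positive everywhere, so this failure has nothing to do 
with indefiniteness; the quadratic model is simply a poor guide away from the solution.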

In our optimx work, Ravi Varadhan and I have developed a prototype safeguarded Newton. 
As yet we have not included it in optimx(), but we probably will in a later version, 
once we figure out what advice to give on where it is appropriate to apply it.

In the meantime, I would suggest that BFGS or L-BFGS-B are the closest options in optim() 
and generally perform quite well. There are updates to BFGS and CG on CRAN in the form of 
Rvmmin and Rcgmin which are all-R implementations with box constraints too. UCMINF is a 
very similar implementation of the unconstrained algorithm that seems to have the details 
done rather well -- though BFGS in optim() is based on my work, I actually find UCMINF 
often does better. There are also nlm() and nlminb().
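
For anyone who wants to try the base R options side by side, here is a minimal sketch 
using only functions in the stats package. The Rosenbrock test function, its gradient and 
the start c(-1.2, 1) are the usual textbook choices (they also appear in the optim() help 
page), not anything specific to zeroinfl(). Rvmmin, Rcgmin and ucminf are left out only 
because they need extra packages installed.

```r
# Rosenbrock "banana" function and its analytic gradient; minimum at c(1, 1).
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) {
  c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
    200 * (x[2] - x[1]^2))
}
start <- c(-1.2, 1)

# Quasi-Newton methods in optim(), given the analytic gradient.
fit_bfgs   <- optim(start, fr, grr, method = "BFGS")
fit_lbfgsb <- optim(start, fr, grr, method = "L-BFGS-B")

# nlminb() also accepts an analytic gradient.
fit_nlminb <- nlminb(start, fr, gradient = grr)

# nlm() called here with no derivatives supplied, so it approximates them numerically.
fit_nlm <- nlm(fr, start)

# Collect the solutions; note the different component names across functions.
solutions <- rbind(BFGS       = fit_bfgs$par,
                   "L-BFGS-B" = fit_lbfgsb$par,
                   nlminb     = fit_nlminb$par,
                   nlm        = fit_nlm$estimate)
print(solutions)
```

It is worth checking the convergence codes as well as the parameters (convergence in 
optim() and nlminb(), code in nlm()), since the conventions differ between the functions.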

Via optimx() one can call these, and also some other minimizers, or even "all.methods", 
though that is meant for learning about methods rather than solving individual problems.

JN