[R] problems with optim, "for"-loops and machine precision

Ken Beath kbeath at efs.mq.edu.au
Wed Jan 10 23:34:10 CET 2007


Two possibilities for why your 7 parameter model fits worse than the 6
are that you are finding a local maximum, which might suggest using a
different parameterisation or the functions are accessing some global
data and so aren't behaving as expected. This could be why they work
properly when run on their own.

I would also check what happens if convergence fails for the 7 parameter
model, in your code this isn't handled well. Also if the constraint on
parameters of >=0 is actually >0, it may be better to work with
parameters on the log scale, avoiding the constraints.

I have found with simulations it is useful to save the fitted objects,
so they can be inspected later, or for the parameters to be extracted
after the models are fitted. This method allows restarting in case of
computer crashes.

Ken

>>> "Simon Ruegg" <s.ruegg at access.unizh.ch> 01/10/07 11:18 PM >>>
Dear R experts,

 

I have been encountering problems with the "optim" routine using "for"
loops. I am determining the optimal parameters of several nested models
by
minimizing the negative Log-Likelihood (NLL) of a dataset. 

 

The aim is to find the model which best describes the data. To this end,
I
am simulating artificial data sets based on the model with the least
number
of parameters (6) and the parameters determined with the field data. For
each artificial set I estimate the parameters of the model with 6
parameters
and the next more complex model with 7 parameters (two of these
parameters
are equal in the 6-parameter model) by minimizing the corresponding NLL
with
"optim". In theory the 7-parameter model should fit the data either
equally
or better than the 6-parameter model. Therefore the difference of the
minimal NLLs should be 0 or larger.

For 500 data sets I use the following code:

 

require(odesolve)

res=matrix(nrow=500,ncol=18)

colnames(res)=c("a_23","beta_23","mu_23","d_23","I_23","M_23","NLL_23",

           
"a_21","beta_21","mu_21","e_21","d_21","I_21","M_21","NLL_21",

            "NLL23_min_21","conv23","conv21")

for (s in 1:500)

{

 
assign("data",read.table(paste("populations/TE23simset_",s,".txt",sep="")),e
nv=MyEnv) #reading a data set

 

  M23=optim(rep(0.1,6),NLL23,method="L-BFGS-B",lower=0,

      upper=c(Inf,Inf,Inf,Inf,1,1),control=c(maxit=150))

  if (M23$convergence==0)

    {

       M21=optim(rep(0.1,7),NLL21,method="L-BFGS-B",lower=0,

            upper=c(Inf,Inf,Inf,Inf,Inf,1,1),control=c(maxit=150))

    }

      res[s,1]=M23$par[1]

      res[s,2]=M23$par[2]

      res[s,3]=M23$par[3]

      res[s,4]=M23$par[4]

      res[s,5]=M23$par[5]

      res[s,6]=M23$par[6]

      res[s,7]=M23$value

      res[s,8]=M21$par[1]

      res[s,9]=M21$par[2]

      res[s,10]=M21$par[3]

      res[s,11]=M21$par[4]

      res[s,12]=M21$par[5]

      res[s,13]=M21$par[6]

      res[s,14]=M21$par[7]

      res[s,15]=M21$value

      res[s,16]=M23$value-M21$value

      res[s,17]=M23$convergence

      res[s,18]=M21$convergence

      write.table(res,"compare23_21TE061221.txt")

      rm(M23,M21)

   }

}

 

For some strange reason the results do not correspond to what I expect:
about 10% of the solutions have a difference of NLL smaller than 0. I
have
verified the optimisation of these results manually and found that a
minimal
NLL was ignored and a higher NLL was returned at "convergence". To check
what was happening I inserted a printing line in the NLL function to
print
all parameters and the NLL as the procedure goes on. To my surprise
"optim"
then stopped at the minimal NLL which had been ignored before. I have
then
reduced the machine precision to .Machine$double.digits=8 thinking, that
the
printing was slowing down the procedure and by reducing the machine
precision to speed up the calculations. For an individual calculation
this
solved my problem. However if I implemented the same procedure in the
loop
above, the same impossible results occurred again.

 

Can anyone tell me where I should be looking for the problem, or what it
is
and how I could solve it?

 

Thanks a lot for your help

 

 

Sincerely

 

Simon

 

 

 

********************************************************************

Simon Ruegg

Dr.med.vet.,  PhD candidate

Institute for Parasitology

Winterthurstr. 266a

8057 Zurich

Switzerland

 

phone: +41 44 635 85 93

fax: +41 44 635 89 07

e-mail: s.ruegg at access.unizh.ch

 


	[[alternative HTML version deleted]]

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list