[R] Passing args and data to optim. WAS: Re: vectorized mle / optim

Wed Oct 24 20:49:30 CEST 2007

On Wed, 24 Oct 2007, Arnaud Le Rouzic wrote:

> Hi the list,
>
> I would need some advice on something that looks like a FAQ: the
> possibility of providing vectors to optim() function.
>

Arnaud,

If I have misunderstood your intent, let me say I am sorry.

I think what you are struggling with is this:

 	Providing data and parameter values to an objective function to be
 	used by optim in a flexible way.

I have experienced some frustration in this myself.

One solution I have used is to write the objective function in a natural 
way - using multiple arguments as needed for convenience and clarity.

Then I write a wrapper function that optim can handle that takes only a 
vector argument. If I need to pass data or fixed values of parameters, I 
do this by assigning them in the environment of the wrapper function.

Section 1.7 of Intro to R has an example ('open.account') of a nifty trick 
that is instructive and helpful for this purpose; you may want to review 
it.

In one project, I found it helpful to write a function that allows me to 
create wrappers and the underlying objective function and gradient in a 
flexible way. It is not well documented, can I send this and an example to 
you off-list if you like.

HTH,

Chuck

> Here is a stupid and short example summarizing the problem:
>
> -------------------------------- example 1 ------------ 8<
> ----------------------
> library(stats4)
> data <- rnorm(100,0,1)
> lik1 <- function(m, v, data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T)
>  lik.var <- dchisq(N*var(data)/v, N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result <- mle(lik1, start=list(m=2, v=2), fixed=list(data=data))
> summary(ml.result)
> ---------------------------------------------------------------------------------------
>
> This works perfectly (except that the default algorithm sometimes tries
> some negative variance parameters).
>
> However, if the parameters (m and v) are considered as a vector of
> parameters, the result is very disappointing:
>
> -------------------------------- example 2 ------------ 8<
> ----------------------
> lik2 <- function(param, data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), param["m"], sqrt(param["v"]/N), log=T)
>  lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result <- mle(lik2, start=list(param=c(m=2, v=2)), fixed=list(data=data))
> ---------------------------------------------------------------------------------------
> "Error in dnorm(mean(data), param["m"], sqrt(param["v"]/N), log = T) :
>  argument "param" is missing, with no default"
>
> One could trust the error message and provide default values, but
> unfortunately,
>
> -------------------------------- example 2b ------------ 8<
> ----------------------
> lik2b <- function(param=c(m=1, v=1), data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), param["m"], sqrt(param["v"]/N), log=T)
>  lik.var <- dchisq(N*var(data)/param["v"], N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result <- mle(lik2b, start=list(param=c(m=2, v=2)),
> fixed=list(data=data))
> ---------------------------------------------------------------------------------------
> "Error in validObject(.Object) :
>  invalid class "mle" object: invalid object for slot "fullcoef" in
> class "mle": got class "list", should be or extend class "numeric""
>
> The example above is very stupid, but there are cases where the estimate
> of vectors of parameters can hardly be avoided. In particular, I'm
> working with time series models with random effects that has to be
> estimated at each time point (the parameters underlying the distribution
> of these random effects are "real" parameters, but the effect itself is
> a nuisance parameter). Of course, the number of variables to estimate
> will depend on the size of the data set, and the function is supposed to
> work for any dataset (convergence is another issue).
>
> The archives of r-help are quite clear on the fact that mle (and optim)
> are not vectorized, dot. I tried to trick mle by defining a likelihood
> function with a "..." parameter, and converting the c(...) into named
> parameters afterwards, but it does not work: mle checks the list of
> parameters against "fullcoef <- formals(minuslogl)". I derived my own
> mle function to force the call of optim() without the "fool-proof"
> tests, but it was probably very naive because it crashes as well.
>
> The fact is that the code of mle(), as well as the R part of optim(), is
> full of (not always clean) tricks to ensure that the parameters are
> exactly as they are supposed to be. I guess the aim of this code is not
> only to slow down the function and to annoy the user, so I would like to
> get some feedback before trying to modify mle() and optim() to force
> them, in one way or another, to turn lists of vectors into lists of
> numeric values. I'm a bit afraid that the problem goes down to the
> .Internal optim function, and I will probably give up if it is the case.
>
> Thanks for any comment,
>
> Arnaud.
>
>
>
> PS: By the way, the "paranoid parameter tests" in mle / optim lead to an
> annoying bug: the "minuslogl" function cannot have "computed" default
> parameters, because all of them must be specified either as "fixed" or
> "start".
>
> -------------------------------- example 3 ------------ 8<
> ----------------------
> lik1b <- function(m, v=var(data), data) {
>  N <- length(data)
>  lik.mean <- dnorm(mean(data), m, sqrt(v/N), log=T)
>  lik.var <- dchisq(N*var(data)/v, N-1, log=T)
>  return(-lik.mean - lik.var)
> }
> ml.result.1 <- mle(lik1b, start=list(m=2, v=2), fixed=list(data=data))
> ml.result.2 <- mle(lik1b, start=list(m=2), fixed=list(data=data))
> ---------------------------------------------------------------------------------------
>
> The second call to mle crashes
> "Error in validObject(.Object) :
>  invalid class "mle" object: invalid object for slot "fullcoef" in
> class "mle": got class "list", should be or extend class "numeric""
>
> I don't see any reason for that: a call to lik1b(m=0, data=data) returns
> the expected value. The bug can be bypassed by using a default such as
> v=NULL, and then if(is.null(v)) v <- var(data) , but it does not seem
> very elegant...
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901