[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Mon May 6 16:40:55 CEST 2019


  That's consistent with (and unsurprising if) the problem lying in the
numerical gradient calculation step: Nelder-Mead is derivative-free, so it
never goes through that code.
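
To spell out why a stale cached value would stop the optimizer at once,
here is a small sketch. It assumes the gradient is a central difference
with optim()'s default step ndeps = 1e-3, which matches the 10.001 and
9.999 in the traces quoted below. central_diff() is only an illustrative
helper, not the C code that optim() actually runs.

```r
# Central-difference gradient of f at x with step eps (illustrative helper).
central_diff <- function(f, x, eps = 1e-3) {
  (f(x + eps) - f(x - eps)) / (2 * eps)
}

f <- function(x) sum(x^2)

# Healthy case: both evaluations are really computed.
g_ok <- central_diff(f, 10)

# Faulty case from the thread: the second evaluation returns the cached
# value of the first one, so the difference cancels exactly.
f_plus <- f(10 + 1e-3)
g_bad  <- (f_plus - f_plus) / (2 * 1e-3)

cat("g_ok =", g_ok, " g_bad =", g_bad, "\n")
```

A gradient of exactly zero would satisfy the projected-gradient test
immediately, which would explain the "NORM OF PROJECTED GRADIENT <= PGTOL"
message after a single function and gradient count.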

On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:
> Optim's Nelder-Mead works correctly for this example.
> 
> 
>> optim(par=10, fn=fn, method="Nelder-Mead")
> x=10, ret=100.02 (memory)
> x=11, ret=121 (calculate)
> x=9, ret=81 (calculate)
> x=8, ret=64 (calculate)
> x=6, ret=36 (calculate)
> x=4, ret=16 (calculate)
> x=0, ret=0 (calculate)
> x=-4, ret=16 (calculate)
> x=-4, ret=16 (memory)
> x=2, ret=4 (calculate)
> x=-2, ret=4 (calculate)
> x=1, ret=1 (calculate)
> x=-1, ret=1 (calculate)
> x=0.5, ret=0.25 (calculate)
> x=-0.5, ret=0.25 (calculate)
> x=0.25, ret=0.0625 (calculate)
> x=-0.25, ret=0.0625 (calculate)
> x=0.125, ret=0.015625 (calculate)
> x=-0.125, ret=0.015625 (calculate)
> x=0.0625, ret=0.00390625 (calculate)
> x=-0.0625, ret=0.00390625 (calculate)
> x=0.03125, ret=0.0009765625 (calculate)
> x=-0.03125, ret=0.0009765625 (calculate)
> x=0.015625, ret=0.0002441406 (calculate)
> x=-0.015625, ret=0.0002441406 (calculate)
> x=0.0078125, ret=6.103516e-05 (calculate)
> x=-0.0078125, ret=6.103516e-05 (calculate)
> x=0.00390625, ret=1.525879e-05 (calculate)
> x=-0.00390625, ret=1.525879e-05 (calculate)
> x=0.001953125, ret=3.814697e-06 (calculate)
> x=-0.001953125, ret=3.814697e-06 (calculate)
> x=0.0009765625, ret=9.536743e-07 (calculate)
> $par
> [1] 0
> 
> $value
> [1] 0
> 
> $counts
> function gradient
>       32       NA
> 
> $convergence
> [1] 0
> 
> $message
> NULL
> 
> 
> 
> 
> ________________________________
> From: R-devel <r-devel-bounces using r-project.org> on behalf of Duncan Murdoch <murdoch.duncan using gmail.com>
> Sent: Friday, May 3, 2019 8:18:44 AM
> To: peter dalgaard
> Cc: Florian Gerber; r-devel using r-project.org
> Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
> 
> 
> It looks as though this happens when calculating numerical gradients:  x
> is reduced by eps, and fn is called; then x is increased by eps, and fn
> is called again.  No check is made that x has other references after the
> first call to fn.
> 
> I'll put together a patch if nobody else gets there first...
> 
> Duncan Murdoch
> 
> On 03/05/2019 7:13 a.m., peter dalgaard wrote:
>> Yes, I think you are right. I was at first confused by the fact that after the optim() call,
>>
>>> environment(fn)$xx
>> [1] 10
>>> environment(fn)$ret
>> [1] 100.02
>>
>> so not 9.999, but this could come from x being assigned the final value without calling fn.
>>
>> -pd
>>
>>
>>> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>
>>> Your results below make it look like a bug in optim():  it is not duplicating a value when it should, so changes to x affect xx as well.
>>>
>>> Duncan Murdoch
>>>
>>> On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
>>>> On 03/05/2019 10:31, Serguei Sokol wrote:
>>>>> On 02/05/2019 21:35, Florian Gerber wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> when using optim() for a function that uses the parent environment, I
>>>>>> see the following unexpected behavior:
>>>>>>
>>>>>> makeFn <- function(){
>>>>>>       xx <- ret <- NA
>>>>>>       fn <- function(x){
>>>>>>          if(!is.na(xx) && x==xx){
>>>>>>              cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
>>>>>>              return(ret)
>>>>>>          }
>>>>>>          xx <<- x; ret <<- sum(x^2)
>>>>>>          cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
>>>>>>          ret
>>>>>>       }
>>>>>>       fn
>>>>>> }
>>>>>> fn <- makeFn()
>>>>>> optim(par=10, fn=fn, method="L-BFGS-B")
>>>>>> # x=10, ret=100 (calculate)
>>>>>> # x=10.001, ret=100.02 (calculate)
>>>>>> # x=9.999, ret=100.02 (memory)
>>>>>> # $par
>>>>>> # [1] 10
>>>>>> #
>>>>>> # $value
>>>>>> # [1] 100
>>>>>> # (...)
>>>>>>
>>>>>> I would expect that optim() does more than 3 function evaluations and
>>>>>> that the optimization converges to 0.
>>>>>>
>>>>>> Same problem with optim(par=10, fn=fn, method="BFGS").
>>>>>>
>>>>>> Any ideas?
>>>>> I don't have an answer but maybe an insight. For some mysterious
>>>>> reason xx is getting changed when it should not. Consider:
>>>>>   > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1;
>>>>>       cat(n, "in x,xx,ret=", x, xx, ret, "\n");
>>>>>       if (!is.na(xx) && x==xx) ret else {xx <<- x; ret <<- x**2;
>>>>>       cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>>>>   > optim(par=10, fn=fn, method="L-BFGS-B")
>>>>> 1 in x,xx,ret= 10 NA NA
>>>>> out x,xx,ret= 10 10 100
>>>>> 2 in x,xx,ret= 10.001 10 100
>>>>> out x,xx,ret= 10.001 10.001 100.02
>>>>> 3 in x,xx,ret= 9.999 9.999 100.02
>>>>> $par
>>>>> [1] 10
>>>>>
>>>>> $value
>>>>> [1] 100
>>>>>
>>>>> $counts
>>>>> function gradient
>>>>>         1        1
>>>>>
>>>>> $convergence
>>>>> [1] 0
>>>>>
>>>>> $message
>>>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>>>>>
>>>>> At the third call, xx has value 9.999 while it should have kept the
>>>>> value 10.001.
>>>>>
>>>> A little follow-up: if you untie the link between xx and x by replacing
>>>> the expression "xx <<- x" by "xx <<- x+0" it works as expected:
>>>>   > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1;
>>>>       cat(n, "in x,xx,ret=", x, xx, ret, "\n");
>>>>       if (!is.na(xx) && x==xx) ret else {xx <<- x+0; ret <<- x**2;
>>>>       cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>>>   > optim(par=10, fn=fn, method="L-BFGS-B")
>>>> 1 in x,xx,ret= 10 NA NA
>>>> out x,xx,ret= 10 10 100
>>>> 2 in x,xx,ret= 10.001 10 100
>>>> out x,xx,ret= 10.001 10.001 100.02
>>>> 3 in x,xx,ret= 9.999 10.001 100.02
>>>> out x,xx,ret= 9.999 9.999 99.98
>>>> 4 in x,xx,ret= 9 9.999 99.98
>>>> out x,xx,ret= 9 9 81
>>>> 5 in x,xx,ret= 9.001 9 81
>>>> out x,xx,ret= 9.001 9.001 81.018
>>>> 6 in x,xx,ret= 8.999 9.001 81.018
>>>> out x,xx,ret= 8.999 8.999 80.982
>>>> 7 in x,xx,ret= 1.776357e-11 8.999 80.982
>>>> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
>>>> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
>>>> out x,xx,ret= 0.001 0.001 1e-06
>>>> 9 in x,xx,ret= -0.001 0.001 1e-06
>>>> out x,xx,ret= -0.001 -0.001 1e-06
>>>> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
>>>> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
>>>> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
>>>> out x,xx,ret= 0.001 0.001 1e-06
>>>> 12 in x,xx,ret= -0.001 0.001 1e-06
>>>> out x,xx,ret= -0.001 -0.001 1e-06
>>>> $par
>>>> [1] -1.334475e-23
>>>> $value
>>>> [1] 1.780823e-46
>>>> $counts
>>>> function gradient
>>>>          4        4
>>>> $convergence
>>>> [1] 0
>>>> $message
>>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>>>> Serguei.
>>>> ______________________________________________
>>>> R-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>
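
For anyone hitting this before a fix lands, two workarounds suggest
themselves from the discussion above. This is a sketch only; make_fn and
gr are names made up for this example. The first stores a copy of x rather
than x itself, as Serguei's x+0 does. The second supplies an analytic
gradient so that the finite-difference code is not used at all; that it
sidesteps the problem is an assumption and has not been checked against
every R version.

```r
# Memoising objective that caches a *copy* of x (x + 0 forces a new vector),
# so later in-place changes to optim()'s x cannot alter the cached value.
make_fn <- function() {
  xx <- ret <- NA
  function(x) {
    if (!anyNA(xx) && all(x == xx)) return(ret)
    xx  <<- x + 0
    ret <<- sum(x^2)
    ret
  }
}

# Workaround 1: copy inside fn, numerical gradient as before.
res1 <- optim(par = 10, fn = make_fn(), method = "L-BFGS-B")

# Workaround 2: analytic gradient of sum(x^2), so no finite differences.
gr <- function(x) 2 * x
res2 <- optim(par = 10, fn = make_fn(), gr = gr, method = "L-BFGS-B")

cat("res1$par =", res1$par, " res2$par =", res2$par, "\n")
```

Both calls should end near 0 rather than stopping at the starting value of
10.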


