[Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

Mon May 6 18:21:04 CEST 2019

It seems that it's an old bug that was found in some other packages, but 
at that time not optim:

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958

and that Duncan Murdoch posted a patch already last Friday :)

Thomas

Am 06.05.2019 um 16:40 schrieb Ben Bolker:
>    That's consistent/not surprising if the problem lies in the numerical
> gradient calculation step ...
>
> On 2019-05-06 10:06 a.m., Ravi Varadhan wrote:
>> Optim's Nelder-Mead works correctly for this example.
>>
>>
>>> optim(par=10, fn=fn, method="Nelder-Mead")
>> x=10, ret=100.02 (memory)
>> x=11, ret=121 (calculate)
>> x=9, ret=81 (calculate)
>> x=8, ret=64 (calculate)
>> x=6, ret=36 (calculate)
>> x=4, ret=16 (calculate)
>> x=0, ret=0 (calculate)
>> x=-4, ret=16 (calculate)
>> x=-4, ret=16 (memory)
>> x=2, ret=4 (calculate)
>> x=-2, ret=4 (calculate)
>> x=1, ret=1 (calculate)
>> x=-1, ret=1 (calculate)
>> x=0.5, ret=0.25 (calculate)
>> x=-0.5, ret=0.25 (calculate)
>> x=0.25, ret=0.0625 (calculate)
>> x=-0.25, ret=0.0625 (calculate)
>> x=0.125, ret=0.015625 (calculate)
>> x=-0.125, ret=0.015625 (calculate)
>> x=0.0625, ret=0.00390625 (calculate)
>> x=-0.0625, ret=0.00390625 (calculate)
>> x=0.03125, ret=0.0009765625 (calculate)
>> x=-0.03125, ret=0.0009765625 (calculate)
>> x=0.015625, ret=0.0002441406 (calculate)
>> x=-0.015625, ret=0.0002441406 (calculate)
>> x=0.0078125, ret=6.103516e-05 (calculate)
>> x=-0.0078125, ret=6.103516e-05 (calculate)
>> x=0.00390625, ret=1.525879e-05 (calculate)
>> x=-0.00390625, ret=1.525879e-05 (calculate)
>> x=0.001953125, ret=3.814697e-06 (calculate)
>> x=-0.001953125, ret=3.814697e-06 (calculate)
>> x=0.0009765625, ret=9.536743e-07 (calculate)
>> $par
>> [1] 0
>>
>> $value
>> [1] 0
>>
>> $counts
>> function gradient
>>        32       NA
>>
>> $convergence
>> [1] 0
>>
>> $message
>> NULL
>>
>>
>>
>>
>> ________________________________
>> From: R-devel <r-devel-bounces using r-project.org> on behalf of Duncan Murdoch <murdoch.duncan using gmail.com>
>> Sent: Friday, May 3, 2019 8:18:44 AM
>> To: peter dalgaard
>> Cc: Florian Gerber; r-devel using r-project.org
>> Subject: Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments
>>
>>
>> It looks as though this happens when calculating numerical gradients:  x
>> is reduced by eps, and fn is called; then x is increased by eps, and fn
>> is called again.  No check is made that x has other references after the
>> first call to fn.
>>
>> I'll put together a patch if nobody else gets there first...
>>
>> Duncan Murdoch
>>
>> On 03/05/2019 7:13 a.m., peter dalgaard wrote:
>>> Yes, I think you are right. I was at first confused by the fact that after the optim() call,
>>>
>>>> environment(fn)$xx
>>> [1] 10
>>>> environment(fn)$ret
>>> [1] 100.02
>>>
>>> so not 9.999, but this could come from x being assigned the final value without calling fn.
>>>
>>> -pd
>>>
>>>
>>>> On 3 May 2019, at 11:58 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>>>>
>>>> Your results below make it look like a bug in optim():  it is not duplicating a value when it should, so changes to x affect xx as well.
>>>>
>>>> Duncan Murdoch
>>>>
>>>> On 03/05/2019 4:41 a.m., Serguei Sokol wrote:
>>>>> On 03/05/2019 10:31, Serguei Sokol wrote:
>>>>>> On 02/05/2019 21:35, Florian Gerber wrote:
>>>>>>> Dear all,
>>>>>>>
>>>>>>> when using optim() for a function that uses the parent environment, I
>>>>>>> see the following unexpected behavior:
>>>>>>>
>>>>>>> makeFn <- function(){
>>>>>>>        xx <- ret <- NA
>>>>>>>        fn <- function(x){
>>>>>>>           if(!is.na(xx) && x==xx){
>>>>>>>               cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
>>>>>>>               return(ret)
>>>>>>>           }
>>>>>>>           xx <<- x; ret <<- sum(x^2)
>>>>>>>           cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
>>>>>>>           ret
>>>>>>>        }
>>>>>>>        fn
>>>>>>> }
>>>>>>> fn <- makeFn()
>>>>>>> optim(par=10, fn=fn, method="L-BFGS-B")
>>>>>>> # x=10, ret=100 (calculate)
>>>>>>> # x=10.001, ret=100.02 (calculate)
>>>>>>> # x=9.999, ret=100.02 (memory)
>>>>>>> # $par
>>>>>>> # [1] 10
>>>>>>> #
>>>>>>> # $value
>>>>>>> # [1] 100
>>>>>>> # (...)
>>>>>>>
>>>>>>> I would expect that optim() does more than 3 function evaluations and
>>>>>>> that the optimization converges to 0.
>>>>>>>
>>>>>>> Same problem with optim(par=10, fn=fn, method="BFGS").
>>>>>>>
>>>>>>> Any ideas?
>>>>>> I don't have an answer but may be an insight. For some mysterious
>>>>>> reason xx is getting changed when in should not. Consider:
>>>>>>> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in
>>>>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx
>>>>>> <<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>>>>>> optim(par=10, fn=fn, method="L-BFGS-B")
>>>>>> 1 in x,xx,ret= 10 NA NA
>>>>>> out x,xx,ret= 10 10 100
>>>>>> 2 in x,xx,ret= 10.001 10 100
>>>>>> out x,xx,ret= 10.001 10.001 100.02
>>>>>> 3 in x,xx,ret= 9.999 9.999 100.02
>>>>>> $par
>>>>>> [1] 10
>>>>>>
>>>>>> $value
>>>>>> [1] 100
>>>>>>
>>>>>> $counts
>>>>>> function gradient
>>>>>>          1        1
>>>>>>
>>>>>> $convergence
>>>>>> [1] 0
>>>>>>
>>>>>> $message
>>>>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>>>>>>
>>>>>> At the third call, xx has value 9.999 while it should have kept the
>>>>>> value 10.001.
>>>>>>
>>>>> A little follow-up: if you untie the link between xx and x by replacing
>>>>> the expression "xx <<- x" by "xx <<- x+0" it works as expected:
>>>>>    > fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in
>>>>> x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<-
>>>>> x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})
>>>>>    > optim(par=10, fn=fn, method="L-BFGS-B")
>>>>> 1 in x,xx,ret= 10 NA NA
>>>>> out x,xx,ret= 10 10 100
>>>>> 2 in x,xx,ret= 10.001 10 100
>>>>> out x,xx,ret= 10.001 10.001 100.02
>>>>> 3 in x,xx,ret= 9.999 10.001 100.02
>>>>> out x,xx,ret= 9.999 9.999 99.98
>>>>> 4 in x,xx,ret= 9 9.999 99.98
>>>>> out x,xx,ret= 9 9 81
>>>>> 5 in x,xx,ret= 9.001 9 81
>>>>> out x,xx,ret= 9.001 9.001 81.018
>>>>> 6 in x,xx,ret= 8.999 9.001 81.018
>>>>> out x,xx,ret= 8.999 8.999 80.982
>>>>> 7 in x,xx,ret= 1.776357e-11 8.999 80.982
>>>>> out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
>>>>> 8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
>>>>> out x,xx,ret= 0.001 0.001 1e-06
>>>>> 9 in x,xx,ret= -0.001 0.001 1e-06
>>>>> out x,xx,ret= -0.001 -0.001 1e-06
>>>>> 10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
>>>>> out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
>>>>> 11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
>>>>> out x,xx,ret= 0.001 0.001 1e-06
>>>>> 12 in x,xx,ret= -0.001 0.001 1e-06
>>>>> out x,xx,ret= -0.001 -0.001 1e-06
>>>>> $par
>>>>> [1] -1.334475e-23
>>>>> $value
>>>>> [1] 1.780823e-46
>>>>> $counts
>>>>> function gradient
>>>>>           4        4
>>>>> $convergence
>>>>> [1] 0
>>>>> $message
>>>>> [1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
>>>>> Serguei.
>>>>> ______________________________________________
>>>>> R-devel using r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>> ______________________________________________
>>>> R-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>> 	[[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel