[R] nls problem: singular gradient

Thu Jul 12 11:34:20 CEST 2012

On Jul 11, 2012, at 20:34 , Jonas Stein wrote:

>> Take a look at the predicted values at your starting fit:  there's a 
>> discontinuity at 0.4, which sure makes it look as though overflow is 
>> occurring.  I'd recommend expanding tanh() in terms of exponentials and 
>> rewrite the prediction in a way that won't overflow.
>> 
>> Duncan Murdoch
> 
> Hi Duncan,
> Thank you for your suggestion. I wrote a function "mytanh" and 
> nls terminates a bit later with another error message:
> 
> Error in nls(data = dd, y ~ 1/2 * (1 - mytanh((x - ttt)/1e-04) * exp(-x/tau2)),  : 
>  number of iterations exceeded maximum of 50
> 
> How can i fix that?
> Kind regards,
> Jonas
> 
> ============================ R CODE STARTS HERE =======
> 
> mytanh <- function(x){
>  return(x - x^3/3 + 2*x^5 /15 - 17 * x^7/315)
> }

Ouch! Your original had something that was very nearly a Heaviside function, i.e. a function that instantaneously switches from 1 to 0 at x=ttt. Replace that with a polynomial and I bet that your fitted curves has no resemblance to the original.

Change-point estimation is notoriously tricky because the sum of squares changes discontinuously as points cross "to the other side" and there will be regions where the objective function is constant, because you can move the change point and still have the same sets of points on each side of. Your function tries to remedy this by using a soft threshold, but it is still quite an abrupt change: if you put ttt in the middle of an interval of length 0.001. Then abs((x-ttt)/0.001) will be at least 5 and 

> tanh(+5)
[1] 0.9999092

Furthermore, the derivative of tanh at that point is roughly 

> (tanh(4.999) - tanh(5.001))/0.002
[1] -0.0001815834

I.e. moving the change point yields only a very small change in the fitted value at the neighboring x values and hardly a change at all at any other point. This is where your singular gradient comes from. 

Pragmatically, you could try a larger value than 0.0001, but I suspect it would be wise to supplement any gradient technique with a more direct search procedure.

Overall, I'd say that you need a bit more "Fingerspitzgefühl" with this sort of optimization problem. Try this, for instance:

> x <- seq(0,1,.001)
> y <- (x > 1/pi) + rnorm(x, sd=.01)
> plot(x,y)
> a <- seq(0,1,0.01)
> S <- sapply(a, function(a) sum((y - (x>a))^2)) # SSD
> plot(a,S) # Looks easy enough?
> a <- seq(.31,.32,0.0001) # Now zoom in
> S <- sapply(a, function(a) sum((y - (x>a))^2))
> plot(a,S) # Oops, try soft threshold
> S <- sapply(a, function(a) sum((y - 0.5*(tanh((x-a)/.0001)+1))^2))
> plot(a,S) # Umm, maybe soften a bit more
> S <- sapply(a, function(a) sum((y - 0.5*(tanh((x-a)/.001)+1))^2))
> plot(a,S)

-pd

> 
> t <- seq(0,1,0.001)
> t0 <- 0.5
> tau1 <- 0.02
> 
> yy <- 1/2 * ( 1- tanh((t - t0)/0.0001) * exp(-t / tau1) ) + rnorm(length(t))*0.001
> 
> plot(x=t, y=yy, pch=18)
> 
> dd <- data.frame(y=yy, x=t)
> 
> nlsfit <- nls(data=dd,  y ~  1/2 * ( 1- mytanh((x - ttt)/0.0001) * exp(-x / tau2) ), start=list(ttt=0.5, tau2=0.02) , trace=TRUE)
> 
> ============================ R CODE ENDS HERE =======
> 
> -- 
> Jonas Stein <news at jonasstein.de>
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com