[Rd] strange `nls' behaviour

Joerg van den Hoff j.van_den_hoff at fzd.de
Tue Nov 6 21:55:05 CET 2007


dear list,

I stumbled over the following strange behaviour/error when
using `nls' which I'm tempted (despite the implied "dangers")
to call a bug:

I've written a driver for `nls' which allows specifying the
model and the data vectors using arbitrary symbols.
these are internally mapped to consistent names, which
poses a slight complication when using `deriv' to provide
analytic derivatives. the following fragment gives the idea:
#-----------------------------------------
f <- function(n = 4) {

   x <- 1:n

   y <- 2 * exp(-1*x) + 2
   y <- rnorm(length(y), y, 0.01*y)   # add 1% gaussian noise to the exact values

   model <- y ~ a * exp (-b*x) + c

   fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x"))
   res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1))

   call.fitfunc <- 
   c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x"))
   call.fitfunc <- as.call(call.fitfunc)
   frml <- as.formula("y ~ eval(call.fitfunc)")
   res2 <- nls(frml, start = c(a=1, b=1, c=1))

   list(res1 = res1, res2 = res2)
}
#-----------------------------------------

the first call to `nls' is the standard way of calling `nls'
when knowing all the names. the second call (yielding `res2')
uses a constructed formula in `frml' (which in this example
is of course not necessary, but in the general case 'a,b,c,x,y' are
not a priori known names).
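
to check that the direct call and the constructed call really describe the same model function, the two can be evaluated side by side outside of `nls'. the following is a minimal sketch, not part of the driver described above; the numeric values of a, b, c and x are assumptions chosen only for illustration:

```r
# illustrative values only (assumed, not taken from the post)
x <- 1:4
a <- 2; b <- 1; c <- 2

# deriv() with a function.arg returns a function; its return value
# carries a "gradient" attribute with one column per parameter
fitfunc <- deriv(quote(a * exp(-b * x) + c), c("a", "b", "c"),
                 c("a", "b", "c", "x"))

# direct call, as used for res1
direct <- fitfunc(a, b, c, x)

# the same call assembled from symbols, as used for res2
call.fitfunc <- as.call(c(list(fitfunc), as.name("a"), as.name("b"),
                          as.name("c"), as.name("x")))
constructed <- eval(call.fitfunc)

# compare the two results and inspect the gradient matrix
print(all.equal(direct, constructed))
print(dim(attr(direct, "gradient")))
print(colnames(attr(direct, "gradient")))
```

if both results agree here, any difference between `res1' and `res2' has to come from how `nls' handles the formula, not from the model function itself.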

here is the problem: the call

f(4)

using 4 data points
runs fine/consistently, as does every call with n > 5.

BUT: for n = 5 (i.e. issuing f(5))
the second fit leads to the error message:

"Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  : 
	invalid type (language) for variable 'call.fitfunc'"

I traced this to a spot in `nls' where a model frame is constructed in the variable `mf'.
the parsing/constructing here seems simply to be messed up for n = 5: `call.fitfunc'
is interpreted as a variable.

I, moreover, empirically noted that the problem occurs when the total number
of parameters plus dependent/independent variables (in the present example
a,b,c,x,y, i.e. 5) equals the number of data points.

so it is not the 'magic' number of 5 but rather the identity of data vector length
and number of variables in the model which leads to the problem.
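
one way to look at this coincidence from outside `nls' is to list the names that appear in the constructed formula and compare the length of each object with the length of the response. the sketch below does only that. it uses a plain stand-in function in place of the `deriv' result, and whether `nls' actually bases its decision on these lengths is an assumption that has not been checked against the 2.5.0 sources. note that a call object has length one plus the number of its arguments:

```r
n <- 5
x <- 1:n
y <- 2 * exp(-x) + 2

# stand-in for the function returned by deriv() (assumption for brevity)
fitfunc <- function(a, b, c, x) a * exp(-b * x) + c

call.fitfunc <- as.call(c(list(fitfunc), as.name("a"), as.name("b"),
                          as.name("c"), as.name("x")))
frml <- as.formula("y ~ eval(call.fitfunc)")

# names appearing in the formula (function names such as eval are excluded)
vars <- all.vars(frml)
print(vars)

# length of each named object; compare with length(y)
lens <- sapply(vars, function(nm) length(get(nm)))
print(lens)
```

for n = 5 the call object and the response have the same length, which matches the condition under which the error was observed.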


this is with R 2.5.0 (which hopefully is not considered ancient) on MacOSX 10.4.10.

any ideas?

thanks

joerg


