Possible bug with glm.nb and starting values (PR#1695)
bcooper@hsph.harvard.edu
bcooper@hsph.harvard.edu
Thu, 20 Jun 2002 18:59:36 +0200 (MET DST)
Full_Name: Ben Cooper
Version: 1.5.0
OS: linux
Submission from: (NULL) (134.174.187.90)
The help page for glm.nb (in MASS package) says that it takes "Any other
arguments for the glm() function except family"
One such argument is start "starting values for the parameters in the linear
predictor."
However, when called with starting values glm.nb returns:
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, :
variable lengths differ
So it looks like this is either a bug, the documentation is inaccurate, or I'm
missing something (or some combination of the above).
An example:
________________________________________________________________
library(MASS)
y<-c(7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6,
12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3,3,4)
lag1<-c(0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6,
12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3,3)
lag2<-c(0,0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6,
12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5,3)
lag3<-c(0,0,0,7,5,4,7,5,2,11,5,5,4,2,3,4,3,5,9,6,7,10,6,
12,6,3,5,3,9,13,0,6,1,2,0,1,0,0,4,5,1,5)
># first a poisson model which is OK
>glm(y~lag1+lag2+lag3,family=poisson(link=identity))
Error: no valid set of coefficients has been found:please supply starting
values
In addition: Warning message:
NaNs produced in: log(x)
> #therefore try:
> glm(y~lag1+lag2+lag3,family=poisson(link=identity),start=c(2,0.1,0.1,0.1))
# and this works. However, negative binomial model is not OK:
> glm.nb(y~lag1+lag2+lag3,link=identity)
Error: no valid set of coefficients has been found:please supply starting
values
In addition: Warning message:
NaNs produced in: log(x)
> #so try
> glm.nb(y~lag1+lag2+lag3,link=identity,start=c(2,0.1,0.1,0.1))
Error in model.frame(formula, rownames, variables, varnames, extras, extranames,
:
variable lengths differ
________________________________________________________________
I can get glm.nb to work with starting values with the following hack to the
glm.nb code:
i)
change the line
m$method <- m$model <- m$x <- m$y <- m$control <- m$contrasts <- m$init.theta
<- m$link <- m$... <- NULL
to
m$method <- m$model <- m$x <- m$y <- m$control <- m$contrasts <- m$init.theta
<- m$link <-m$start <- m$... <- NULL
[i.e. insert m$start]
ii)
remove the line:
start <- model.extract(m, start)
iii)
change the line
fit <- glm.fitter(x = X, y = Y, w = w, etastart = start,
offset = offset, family = fam0, control = list(maxit = control$maxit,
epsilon = control$epsilon, trace = control$trace >
1))
to
fit <- glm.fitter(x = X, y = Y, w = w, start=start,
offset = offset, family = fam0, control = list(maxit = control$maxit,
epsilon = control$epsilon, trace = control$trace >
1))
iv) change default value of start to NULL (as in glm) rather than eta when
glm.nb is called.
newglm.nb with these changes then works.
But no doubt there are reasons for things being the way they are and these
changes screw other things up.
$platform
[1] "i686-pc-linux-gnu"
$arch
[1] "i686"
$os
[1] "linux-gnu"
$system
[1] "i686, linux-gnu"
$status
[1] ""
$major
[1] "1"
$minor
[1] "5.0"
$year
[1] "2002"
$month
[1] "04"
$day
[1] "29"
$language
[1] "R"
I also tried this with version 1.4.1 under Windows and had the same probelm.
cheers
Ben
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._