[Rd] gaussian family change suggestion

Simon Wood sw283 at maths.bath.ac.uk
Tue Apr 11 17:26:03 CEST 2006


> >>> Currently the `gaussian' family's initialization code signals an error if
> >>> any response data are zero or negative and a log link is used. Given that
> >>> zero or negative response data are perfectly legitimate under the GLM
> >>> fitted using `gaussian("log")', this seems a bit unsatisfactory. Might
> >>> it be worth changing it?
> >>
> >> Well, that's not really true: it says it is unable to find starting
> >> values, and asks you to provide some.  I don't really see why your choices
> >> are satisfactory: what happens if all the data are negative for example?
> >
> > - then the suggested code traps it.
>
> Yes, but why (it can happen too)?

- isn't this (i.e. all response <=0 ) a case where the MLE's of the
linear predictor must tend to minus infinity? So there's probably  an
argument for handling it differently [although my proposed error message
is not the most informative, since modifying the starting values won't help].

best,
Simon

>
> >> We could try even harder, but code that is almost never used tends to get
> >> broken whilst no one is testing it.  So if you want to pursue it I think
> >> we need a comprehensive test suite.
> >
> > - well, I guess, but I wouldn't have thought that zeros in data modelled
> > using `gaussian("log")' is such a rare occurance is it? [Or did you mean
> > that `gaussian("log")' is almost never used, and should hence be kept
> > simple].
>
> I meant initialization code for this case would rarely get used, as I
> suppose those people who do this are used to providing starting values
> (or there are few or no such people).
>
> >
> > I suppose there are arguments both ways...
> >
> > best,
> > Simon
> >
> >>
> >>> The current offending code from `gaussian' is:
> >>>
> >>> initialize = expression({
> >>>            n <- rep.int(1, nobs)
> >>>            if (is.null(etastart) && is.null(start) && is.null(mustart) &&
> >>>                ((family$link == "inverse" && any(y == 0)) ||
> >>>                  (family$link == "log" && any(y <= 0)))) stop(
> >>>              "cannot find valid starting values: please specify some")
> >>>            mustart <- y
> >>>        })
> >>>
> >>> A possible replacement would be ...
> >>>
> >>> initialize = expression({
> >>>            n <- rep.int(1, nobs)
> >>>            if (is.null(etastart) && is.null(start) && is.null(mustart) &&
> >>>                ((family$link == "inverse" && all(y == 0)) ||
> >>>                  (family$link == "log" && all(y <= 0)))) stop(
> >>>      "cannot find valid starting values: please specify some")
> >>>        mustart <- y
> >>>      if (family$link=="log") {
> >>>        iy <- y<=0
> >>>        if (sum(iy)) mustart[iy] <- min(y[!iy])*.5
> >>>      } else if (family$link=="inverse") {
> >>>        iy <- y==0
> >>>        if (sum(iy)) mustart[iy] <- min(abs(y[!iy]))*.5
> >>>      }
> >>>        })
> >>>
> >>> best,
> >>> Simon
> >>>
> >>>
> >>>> - Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY
> >>>> -             +44 (0)1225 386603         www.maths.bath.ac.uk/~sw283/
> >>>
> >>> ______________________________________________
> >>> R-devel at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>>
> >>
> >> --
> >> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> >> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> >> University of Oxford,             Tel:  +44 1865 272861 (self)
> >> 1 South Parks Road,                     +44 1865 272866 (PA)
> >> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >>
> >
> >
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>



More information about the R-devel mailing list