[Rd] gaussian family change suggestion

Thomas Lumley tlumley at u.washington.edu
Tue Apr 11 16:58:36 CEST 2006


On Tue, 11 Apr 2006, Prof Brian Ripley wrote:

> On Tue, 11 Apr 2006, Simon Wood wrote:
>
>> Hi,
>>
>> Currently the `gaussian' family's initialization code signals an error if
>> any response data are zero or negative and a log link is used. Given that
>> zero or negative response data are perfectly legitimate under the GLM
>> fitted using `gaussian("log")', this seems a bit unsatisfactory. Might
>> it be worth changing it?
>
> Well, that's not really true: it says it is unable to find starting
> values, and asks you to provide some.  I don't really see why your choices
> are satisfactory: what happens if all the data are negative for example?
> (You get +Inf in the log case, I believe.  The algorithm will probably get
> stuck from a finite starting point, but it will produce a mysterious error
> from an infinite one.)
>
> We could try even harder, but code that is almost never used tends to get
> broken whilst no one is testing it.  So if you want to pursue it I think
> we need a comprehensive test suite.

I have actually been working with log-link glms quite a bit recently and 
was planning to change the defaults.

 	-thomas



>
>> The current offending code from `gaussian' is:
>>
>> initialize = expression({
>>            n <- rep.int(1, nobs)
>>            if (is.null(etastart) && is.null(start) && is.null(mustart) &&
>>                ((family$link == "inverse" && any(y == 0)) ||
>>                  (family$link == "log" && any(y <= 0)))) stop(
>>              "cannot find valid starting values: please specify some")
>>            mustart <- y
>>        })
>>
>> A possible replacement would be ...
>>
>> initialize = expression({
>>            n <- rep.int(1, nobs)
>>            if (is.null(etastart) && is.null(start) && is.null(mustart) &&
>>                ((family$link == "inverse" && all(y == 0)) ||
>>                  (family$link == "log" && all(y <= 0)))) stop(
>>      "cannot find valid starting values: please specify some")
>>        mustart <- y
>>      if (family$link=="log") {
>>        iy <- y<=0
>>        if (sum(iy)) mustart[iy] <- min(y[!iy])*.5
>>      } else if (family$link=="inverse") {
>>        iy <- y==0
>>        if (sum(iy)) mustart[iy] <- min(abs(y[!iy]))*.5
>>      }
>>        })
>>
>> best,
>> Simon
>>
>>
>>> - Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY
>>> -             +44 (0)1225 386603         www.maths.bath.ac.uk/~sw283/
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle



More information about the R-devel mailing list