[R] How does glm(family='binomial') deal with perfect success?

Douglas Bates bates at stat.wisc.edu
Tue Mar 20 22:48:12 CET 2007


On 3/20/07, Mike Lawrence <M4Lawren at uwaterloo.ca> wrote:
> Hi all,
>
> Trying to understand the logistic regression performed by glm (i.e. when
> family='binomial'), and I'm curious to know how it treats perfect
> success. That is, let's say I have the following summary data
>
>         x=c(1,2,3,4,5,6)
>         y=c(0,.04,.26,.76,.94,1)
>         w=c(100,100,100,100,100,100)
>
> where y is the probability of success at each value of x,
> calculated across w observations. When I use glm
>
>         my.glm.obj=glm(y~x,family='binomial',weights=w)
>
> the regression comes out fine, but if I try what I understand to be the
> equivalent lm procedure (i.e. fitting a straight line to the logit
> transformed y values):
>
>         my.lm.obj=lm(qlogis(y)~x,weights=w)
>
> I get an error because, of course, logit(1) = log(1/0) = log(Inf) = Inf
> (similarly, logit(0) = log(0/1) = log(0) = -Inf).

> I'd be very interested to see how glm deals with these extremes.

Well, for one thing, glm doesn't fit the model that way.  It uses
iteratively reweighted least squares (IRLS), in which the next set of
parameter values is obtained from a weighted least squares fit
formulated on the scale of the mean, not on the scale of the linear
predictor, for exactly the reason you have encountered: the response
is never logit-transformed, so observed proportions of 0 or 1 cause
no difficulty.
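
To make that concrete, here is a minimal IRLS sketch for the logit
link, written against the x, y and w from your message.  It is only an
illustration of the idea, not the actual glm.fit code.  Note that y
enters only through the residual y - mu on the probability scale;
qlogis() is never applied to y itself, so y = 0 or y = 1 never
produces an infinite value.

        ## Minimal IRLS sketch for logistic regression (illustration only)
        x <- c(1, 2, 3, 4, 5, 6)
        y <- c(0, .04, .26, .76, .94, 1)
        w <- c(100, 100, 100, 100, 100, 100)

        X    <- cbind(1, x)   # design matrix: intercept and slope
        beta <- c(0, 0)       # starting values

        for (i in 1:25) {
            eta <- drop(X %*% beta)                  # current linear predictor
            mu  <- plogis(eta)                       # fitted probabilities, strictly inside (0, 1)
            z   <- eta + (y - mu) / (mu * (1 - mu))  # working response: always finite
            W   <- w * mu * (1 - mu)                 # working weights
            beta <- solve(crossprod(X, W * X), crossprod(X, W * z))
        }

        drop(beta)
        coef(glm(y ~ x, family = 'binomial', weights = w))  # agrees up to convergence tolerance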


