[R] warning associated with Logistic Regression

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Jan 25 18:48:06 CET 2004


On 25 Jan 2004, Peter Dalgaard wrote:

> David Firth <d.firth at warwick.ac.uk> writes:
> 
> > On Sunday, Jan 25, 2004, at 13:59 Europe/London, Guillem Chust wrote:
> > 
> > > Hi All,
> > >
> > > When I tried to do logistic regression (with a high maximum number of
> > > iterations) I got the following warning message:
> > >
> > > Warning message:
> > > fitted probabilities numerically 0 or 1 occurred in: (if
> > > (is.empty.model(mt)) glm.fit.null else glm.fit)(x = X, y = Y,
> > >
> > > From what I found in the R-help archives, it seems that this happens
> > > when the dataset exhibits complete separation.
> > 
> > Yes, correct.
> 
> Sufficient but not necessary. It can happen just by numerical roundoff
> if the effect is strong enough. (I have an example with age and
> prevalent menarche: for nearly all women this happens between the age
> of 10 and 18, so if you have a couple of 40-year olds in your data
> set, they'll get a fitted p of 1. Happens even more easily if you
> throw in a cubic term.)
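
A minimal sketch of the roundoff point, using made-up coefficients (b0, b1
and the ages are assumptions for illustration, not estimates from any
menarche data). With a steep enough slope, the linear predictor for a
40-year-old is so large that the fitted probability is exactly 1 in double
precision, with no separation involved. The threshold of
10 * .Machine$double.eps is, to my knowledge, of the order glm.fit uses for
this warning; treat it as an assumption.

```r
# Hypothetical logistic curve: median age 13, slope 1.5 per year.
b0 <- -19.5
b1 <- 1.5
age <- c(10, 13, 18, 40)

eta <- b0 + b1 * age   # linear predictor
p   <- plogis(eta)     # fitted probability 1 / (1 + exp(-eta))

# Assumed threshold for "numerically 0 or 1".
eps <- 10 * .Machine$double.eps
near_zero <- p < eps
near_one  <- p > 1 - eps

print(data.frame(age, eta, p, near_zero, near_one))
```

Only the 40-year-old is flagged: exp(-40.5) is smaller than the spacing of
doubles near 1, so 1 + exp(-eta) rounds to 1.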

It also happens with partial separation (when some but not all of the
fitted values go to 0/1).  A common case is a cell in an interaction of
factors that contains only one observation, which can then be fitted exactly.
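
The singleton-cell case can be sketched as follows. The data frame is a toy
example invented for illustration (factors f and g, response y); the cell
f = "b", g = "y" contains one observation. Whether the warning itself is
raised depends on the convergence tolerance, so the sketch only inspects
the fitted values and the coefficient table.

```r
# Toy data (hypothetical): cell (f = "b", g = "y") holds a single case.
d <- data.frame(
  f = factor(c("a", "a", "a", "a", "b", "b", "b", "b", "b")),
  g = factor(c("x", "x", "y", "y", "x", "x", "x", "x", "y")),
  y = c(0, 1, 0, 1, 0, 1, 1, 0, 1)
)

# The interaction model has one parameter per cell, so the singleton
# cell can be fitted exactly and its fitted probability is pushed to 1.
fit <- glm(y ~ f * g, family = binomial, data = d,
           control = glm.control(maxit = 50))

p_single <- unname(fitted(fit)[9])   # the lone case in cell (b, y)
p_others <- unname(fitted(fit)[1:8]) # cases in the mixed cells
coefs    <- summary(fit)$coefficients

print(1 - p_single)
print(coefs["fb:gy", ])
```

The interaction coefficient drifts to a large value with a very large
standard error, while the other cells are fitted at their observed
proportions.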

Another example is a dataset of, say, 8,000 people that would show complete
separation except that one case was recorded incorrectly -- then the MLE
occurs at large but finite parameter values, and cases dissimilar to the
erroneous one will have fitted probabilities very near (but not exactly) 0/1.
The asymptotic theory is valid but practically useless (the Hauck-Donner
effect) in such problems, since for them 8,000 is a small sample.
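
A rough illustration of the Hauck-Donner effect in its simplest
one-parameter form, not the 8,000-person regression described above. The
counts (k successes out of n = 1000) are hypothetical. As the observed
proportion moves further from 0.5, the Wald z statistic for the intercept
shrinks while the likelihood-ratio statistic keeps growing.

```r
# Hypothetical counts: k successes out of n trials.
n <- 1000
k <- c(900, 990, 999)

# Wald z for H0: intercept = 0 (p = 0.5), as reported by summary.glm().
wald_z <- sapply(k, function(ki) {
  fit <- glm(cbind(ki, n - ki) ~ 1, family = binomial)
  summary(fit)$coefficients[1, "z value"]
})

# Likelihood-ratio statistic for the same hypothesis, in closed form.
p_hat <- k / n
lr <- 2 * (k * log(p_hat / 0.5) + (n - k) * log((1 - p_hat) / 0.5))

print(round(cbind(k, wald_z, lr), 2))
```

The standard error grows faster than the estimate, so the Wald statistic
understates the evidence precisely when the effect is strongest;
likelihood-based comparisons (e.g. anova() on nested fits) do not share
that problem.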

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



