[R] logistic regression

Thu Mar 15 20:38:54 CET 2007

On 15-Mar-07 17:03:50, Milton Cezar Ribeiro wrote:
> Dear All,
> 
> I would like to fit a model and find the "R2" for the following
> presence/absence data:
> 
> x<-1:10
> y<-c(0,0,0,0,0,0,0,1,1,1)
> 
> I tried to use clogit (survival package) but it didn't work.
> 
> Any idea?
> 
> miltinho

You are trying to fit an equation

  P[y = 1 ; x] = exp((x-a)/b)/(1 + exp((x-a)/b))

to data

  x =   1   2   3   4   5   6   7   8   9  10

  y =   0   0   0   0   0   0   0   1   1   1

by what amounts to a maximum-likelihood method, i.e. which
chooses the parameter values to maximize the probability of
the observed values of y (given the values of x).

The maximum probability possible is 1, so if you can find
parameters which make P[y = 1] = 0 for x = 1, 2, ... , 7
and P[y = 1] = 1 for x = 8, 9, 10 then you have done it.

This will be approximated as closely as you please for any
value of a between 7 and 8, and sufficiently small values of b,
since for such a, as b -> 0, P[y = 1 ; x] -> 0 for x < a,
and -> 1 for x > a.
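
To see this numerically, here is a minimal sketch in R. The
function name lik and the grid of b values are my own choices
for illustration, not anything from a package. It evaluates the
probability of the observed y at a = 7.5 for decreasing b:

```r
x <- 1:10
y <- c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1)

# Probability of the observed y when P[y = 1 ; x] = plogis((x - a)/b),
# where plogis(z) = exp(z)/(1 + exp(z))
lik <- function(a, b) {
  p <- plogis((x - a) / b)
  prod(ifelse(y == 1, p, 1 - p))
}

# Illustrative grid of b values (an assumption, chosen only to show the trend)
b_values <- c(1, 0.5, 0.1, 0.05, 0.01)
lik_values <- sapply(b_values, function(b) lik(a = 7.5, b = b))

print(data.frame(b = b_values, likelihood = lik_values))
```

The likelihood column should climb towards 1 as b shrinks, with
no finite b at which it peaks.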

You therefore have a solution which is both indeterminate
(any a such that 7 < a < 8) and singular (b -> 0). So it
will defeat standard estimation methods.
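
As an illustration, here is a sketch of what happens with glm().
It assumes the model is written as logit P = alpha + beta*x,
so that b = 1/beta and a = -alpha/beta. The names msgs, a_hat
and b_hat are just for this example, and the exact warnings and
coefficient values may differ between R versions:

```r
x <- 1:10
y <- c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1)

# Collect any warnings raised by glm() instead of letting them scroll past
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)       # warnings from the fitting routine, if any
print(coef(fit))  # alpha (intercept) and beta (slope)

# Convert back to the (a, b) parametrization used above
b_hat <- unname(1 / coef(fit)[2])
a_hat <- unname(-coef(fit)[1] / coef(fit)[2])
print(c(a_hat = a_hat, b_hat = b_hat))
```

The slope returned is simply wherever the iterations happened to
stop, so b_hat is small but otherwise arbitrary, and a_hat lands
somewhere between 7 and 8.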

That is the source of your problem. In a more general context,
this is an instance of the "linear separation" problem in
logistic regression (and similar methods, such as probit
analysis). Basically, this situation implies that, according
to the data, there is a perfect prediction for the results.

There is no well-defined way of dealing with it; any approach
starts from the proposition "this perfect prediction is not
a reasonable result in the context of my data", and continues
by following up what you think should be meant by "not a
reasonable result". What this is likely to mean would be on
the lines of "b should not be that small", which then imposes
upon you the need to be more specific about how small b may
reasonably be. Then carry on from there (perhaps by fixing
the value of b at different reasonable levels, and simply
fitting a for each value of b).
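
A rough sketch of that last suggestion, fixing b and fitting a
by one-dimensional maximum likelihood. The values in b_grid are
purely hypothetical (what counts as "reasonable" depends on your
context), and negloglik and fits are names made up for this
example:

```r
x <- 1:10
y <- c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1)

# Negative log-likelihood in a, for a fixed b.
# plogis(-eta) equals 1 - plogis(eta); log.p = TRUE avoids log(0).
negloglik <- function(a, b) {
  eta <- (x - a) / b
  -sum(y * plogis(eta, log.p = TRUE) +
       (1 - y) * plogis(-eta, log.p = TRUE))
}

# Hypothetical candidate values for b, largest first
b_grid <- c(2, 1, 0.5, 0.25)

fits <- t(sapply(b_grid, function(b) {
  opt <- optimize(negloglik, interval = c(0, 11), b = b)
  c(b = b, a_hat = opt$minimum, loglik = -opt$objective)
}))

print(fits)
```

For each fixed b the estimate of a is well defined, which is the
point of the exercise. The log-likelihood still rises as b gets
smaller, so the data alone will not choose b for you.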

Hoping this helps ... but I'm wondering how it happens that
you have such data ... ??

best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 15-Mar-07                                       Time: 19:38:51
------------------------------ XFMail ------------------------------