[R] binomial GLM quasi separation

Gavin Simpson gavin.simpson at ucl.ac.uk
Mon Oct 17 14:07:18 CEST 2011


On Sat, 2011-10-15 at 09:11 -0700, lincoln wrote:
> #Uwe:
<snip />
> #Gavin:
> 
> I have read carefully your thread but I am not sure to understand what you
> are suggesting (my gaps in statistics!). You say that it should be due to
> the /Hauck Donner/ effect and that it is not a quasi separation or
> separation issue. Even though, I am still unsure to understand why I found
> such a high asymptotic standard error.

I don't believe this is a separation issue - the sorts of things we'd
expect to see if this were due to separation do not show up.

Given the large estimate for the coefficient for the term it is not that
surprising that the associated uncertainty is also high:

> set.seed(2)
> var(runif(100))
[1] 0.08911998
> set.seed(2)
> var(runif(100)*10000)
[1] 8911998

All I did there was multiply the data by 10000 in the second example.
The variance is huge (10000^2 times larger), but only because the
values are 10000 times bigger than in the first example. In the same
way, the coefficient estimate is large so its standard error is also
large; the question one needs to ask is whether the estimate of the
coefficient for hcp is bounded away from zero, given the uncertainty in
the estimate.
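The same point can be made for a model coefficient. This is only a
sketch on simulated data (x, y and the factor of 1000 are made up for
illustration, they are not your data): expressing the predictor in
units 1000 times larger multiplies both the estimate and its standard
error by 1000, and the z value is unchanged.

```r
## Simulated data; nothing here comes from the original poster's data
set.seed(2)
x <- runif(200)
y <- rbinom(200, 1, plogis(-1 + 2 * x))

## Same predictor expressed in units 1000 times larger,
## so the numerical values are 1000 times smaller
x_big <- x / 1000

fit_small <- glm(y ~ x, family = binomial)
fit_big   <- glm(y ~ x_big, family = binomial)

## Columns 1:3 are Estimate, Std. Error and z value
tab <- rbind(small_units = coef(summary(fit_small))["x", 1:3],
             big_units   = coef(summary(fit_big))["x_big", 1:3])
tab
```

A large standard error on its own says little; it has to be read
relative to the size of the estimate.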

If you were to produce a profile confidence interval it too would be
large.
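For reference, a minimal sketch of how one might get such an interval,
again on simulated data (x and y are invented here). confint() on a
glm object gives profile likelihood intervals (the method lived in
MASS in older versions of R), whereas confint.default() gives the
Wald intervals based on the asymptotic standard errors.

```r
## Simulated data, for illustration only
set.seed(2)
x <- runif(200)
y <- rbinom(200, 1, plogis(-1 + 2 * x))
fit <- glm(y ~ x, family = binomial)

## Profile likelihood intervals
ci_profile <- suppressMessages(confint(fit))

## Wald intervals: estimate +/- qnorm(0.975) * SE
ci_wald <- confint.default(fit)

ci_profile
ci_wald
```

What matters is whether the interval for the term of interest excludes
zero, not how wide it is in absolute terms.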

So you have a large estimate, which is somewhat uncertain. Given that
the slope of the log likelihood is low at the estimate and quite
different from the slope at \beta == 0, it is not unreasonable to assume
that the Hauck Donner effect might be present...

> Anyway, how should I consider this result? Should I find another way to
> analyze this process or I could consider this as correct?

...however, in the case of the snippet of data you showed, it doesn't
affect the result - on the basis of the Wald test you would still accept
that hcp is significant/important. The Hauck Donner effect might be
leading to a lower value of the test statistic, but it hasn't affected
the outcome of the test.
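To see the effect in isolation, here is a toy illustration with made-up
counts (two groups of 100 trials; none of this is your data). The
first group has 50 successes throughout, and the second has k
successes. As k approaches 100 the evidence for a difference gets
stronger and the likelihood ratio statistic keeps growing, but the
Wald z peaks and then declines because the standard error grows faster
than the estimate.

```r
## Hypothetical counts chosen only to illustrate the Hauck Donner effect
hd <- t(sapply(c(90, 95, 98, 99), function(k) {
  d <- data.frame(succ = c(50, k), fail = c(50, 100 - k), grp = c(0, 1))
  full <- glm(cbind(succ, fail) ~ grp, family = binomial, data = d)
  null <- glm(cbind(succ, fail) ~ 1, family = binomial, data = d)
  c(k = k,
    wald_z = coef(summary(full))["grp", "z value"],  # Wald statistic
    lrt = deviance(null) - deviance(full))           # LR statistic, 1 df
}))
hd
```

In the table the Wald z for k = 99 is smaller than for k = 95, even
though the groups differ more, while the lrt column only increases.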

To check, fit the model with and without hcp and then use the anova()
function to compare the two models. This will do a likelihood ratio
test.
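Something along these lines, where the data are simulated and the
response y and the second covariate x2 are placeholders I have made up
(only the name hcp comes from your model):

```r
## Simulated stand-in for the real data; y and x2 are hypothetical names
set.seed(2)
n <- 200
d <- data.frame(hcp = runif(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-1 + 2 * d$hcp + 0.5 * d$x2))

## Fit the model without and with hcp
m_without <- glm(y ~ x2, family = binomial, data = d)
m_with    <- glm(y ~ x2 + hcp, family = binomial, data = d)

## Likelihood ratio test: the change in deviance is referred to a
## chi-squared distribution on 1 degree of freedom
lrt <- anova(m_without, m_with, test = "Chisq")
lrt
```

The p-value is in the "Pr(>Chi)" column of the second row.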

> If I am understanding this enough, this warning message

Possibly, but it could just be that the fitted probabilities really are
0 or 1.

>  and the very high
> estimates should be due to  /Hauck-Donner/. Regarding that reference to
> Venables and Ripley (2002) on this issue, I have found this ( 
> http://kups.ku.edu/maillist/classes/ps707/2005/msg00023.html Hauck-Donner  )
> where it is said that "The practical advice, then, is to run the model with
> all of the variables, and then run again with the questionable one removed,
> and conduct a likelihood ratio test." and I suppose that the p-values for
> hcp should be the LRT p-value, isn't it?

Yes. Well it is the result of applying a likelihood ratio test. I don't
think there is such a thing as *the* p-value for a term in a model, just
different ways of computing *a* p-value.

In this case, what does it matter? If the Wald test is *under*estimating
z but the term *is* still significant, the LRT should only confirm this
and give an even lower p-value than the already very low one.

> Thanks for taking your time to help me in this.

Would it hurt you to reply via an email? Regardless of what Nabble
thinks, R-help is a mailing list and your *posts* keep on removing all
the context - I have to keep on hunting for the thread in the archives
just to keep track of what you have told us.

G

> Simone
> 
> 
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/binomial-GLM-quasi-separation-tp3901687p3907716.html
> Sent from the R help mailing list archive at Nabble.com.
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


