[R] Hosmer-Lemeshow 'goodness of fit'
info at aghmed.fsnet.co.uk
Fri Mar 30 14:23:24 CEST 2012
At 19:34 29/03/2012, JimeBoHe wrote:
>I am a new user in R, so I am sorry if this is a basic question but I am kind
>of lost here and I really would appreciatte your help.
>I have behavioral information of sea lions and I've done a binomial
>Generalized Linear model using Rcmdr, to understand what variables are
>affecting the place where Male's fights are being made (water or land). One
>of my possible models includes Agression Type, number of females around the
>fight and temperature in the moment.
> > GLM.2 <- glm(Place ~ AgType + Females + Temp, family=binomial(cloglog),
> > summary(GLM.2)
>glm(formula = Place ~ AgType + Females + Temp, family = binomial(cloglog),
> data = June)
> Min 1Q Median 3Q Max
>-2.4448 -0.7248 0.4220 0.6741 1.8798
> Estimate Std. Error z value Pr(>|z|)
>(Intercept) 3.43071 0.82164 4.175 2.97e-05 ***
>AgType -1.23463 0.37166 -3.322 0.000894 ***
>Females 0.05812 0.01320 4.404 1.06e-05 ***
>Temp -0.08614 0.02453 -3.512 0.000444 ***
>Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>(Dispersion parameter for binomial family taken to be 1)
> Null deviance: 175.79 on 141 degrees of freedom
>Residual deviance: 125.33 on 138 degrees of freedom
>Number of Fisher Scoring iterations: 8
>When I ran hosmerlem test it seems that I have a good fit to the data.
> > hosmerlem(Place, fitted(GLM.2))
> X^2 Df P(>Chi)
>7.5123428 8.0000000 0.4824925
>However if I run only AgType, or Agtype + Temp, when running Hosmerlem test;
>it gives me an error that says: * ERROR: 'breaks' are not unique*.
>I've been having the same error when I try to run other models or when I use
>different link functions like logit.
>So... right now my questions are:
>1. What is this error means? as far as I have seen in the Hosmerlem test
>code that I am using, it seems to depend on the yhat probs, but to be honest
>I don't really understand what is this "yhat". The code I am using is:
yhat is the predicted value of y (the outcome variable). You do not
provide us with a reproducible example so this is only a guess but I
suspect if you look at the values of yhat for the failing models you
will find you have many replicated values and so at least two values
of the breakpoints are identical. But, as I say, this is at best a guess.
>function (y, yhat, g = 10)
> cutyhat <- cut(yhat, breaks = quantile(yhat, probs = seq(0,
> 1, 1/g)), include.lowest = T)
> obs <- xtabs(cbind(1 - y, y) ~ cutyhat)
> expect <- xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
> chisq <- sum((obs - expect)^2/expect)
> P <- 1 - pchisq(chisq, g - 2)
> c("X^2" = chisq, Df = g - 2, "P(>Chi)" = P)
>2. I've seen in older posts that Cessie - van Houwelingen - Copas - Hosmer
>unweighted sum of squares test for global goodness of fit test; seems to be
>a better alternative. However I am probably doing something wrong because I
>can't get it work.
>I installed and loaded rms package and I ran the following command:
> > resid(GLM.2, 'gof')
>As far as I understand 'gof' is another package with this description:
>Implementation of model-checking technique based on cumulative residuals.
>So, for the command I also installed and loaded 'gof' package
You seem to be so far away from success here that I think remote
advice is not going to help.
>However I get this: *ERROR: 'arg' should be one of "deviance", "pearson",
>"working", "response", "partial"*
>If I use any of the arguments that it suggests, it gives me a list of
>residuals, but I doesn't show the Z or the P.
That is because it is doing what it is documented to. I am not
familiar with the gof function nor do I see it in the index for rms
(as you implied) so it is a bit hard to know what you are trying to do here.
>So, if someone could help me understand what I am doing wrong I really would
>View this message in context:
>Sent from the R help mailing list archive at Nabble.com.
info at aghmed.fsnet.co.uk
More information about the R-help