[R] logistic regression with glm: cooks distance and dfbetas are different compared to SPSS output

Uwe Ligges ligges at statistik.tu-dortmund.de
Mon May 2 13:16:30 CEST 2011



On 29.04.2011 18:29, "Biedermann, Jürgen" wrote:
> Hi there,
>
> My problem is that I cannot reproduce the SPSS residual statistics
> (dfbeta and Cook's distance) for a simple binary logistic regression
> model fitted in R with the glm function.
>
> I tried the following:
>
> fit <- glm(y ~ x1 + x2 + x3, data, family=binomial)
>
> cooks.distance(fit)

Just type stats::cooks.distance.glm and see the definition in R yourself:

function (model, infl = influence(model, do.coef = FALSE),
     res = infl$pear.res, dispersion = summary(model)$dispersion,
     hat = infl$hat, ...)
{
     p <- model$rank
     res <- (res/(1 - hat))^2 * hat/(dispersion * p)
     res[is.infinite(res)] <- NaN
     res
}
<environment: namespace:stats>

Now you can dig further yourself. I do not know how to find the
algorithm SPSS actually uses, hence I cannot tell what is different.
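
As a starting point for that digging, here is a minimal sketch that
recomputes R's Cook's distance by hand from the pieces named in the
definition of cooks.distance.glm. The data are simulated and all object
names (dat, cook_manual, cook_unscaled) are made up for illustration.
The variant without the division by p is only an assumption about what
another package might report, not a confirmed SPSS formula; check it
against the SPSS algorithm documentation before relying on it.

```r
## Simulated data, for illustration only
set.seed(1)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + dat$x1 - 0.5 * dat$x2))

fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

## Ingredients used by cooks.distance.glm
infl <- influence(fit, do.coef = FALSE)
pear <- infl$pear.res                 # Pearson residuals
hat  <- infl$hat                      # leverages
p    <- fit$rank                      # number of estimated coefficients
phi  <- summary(fit)$dispersion       # fixed at 1 for the binomial family

## R's definition, written out by hand
cook_manual <- (pear / (1 - hat))^2 * hat / (phi * p)
stopifnot(isTRUE(all.equal(cook_manual, cooks.distance(fit))))

## Hypothetical variant without the division by p (an assumption to
## compare against the other software, not a documented SPSS formula)
cook_unscaled <- cook_manual * p

## R offers both the raw and the scaled coefficient changes
raw_change    <- dfbeta(fit)          # change in each coefficient
scaled_change <- dfbetas(fit)         # the same, scaled by a standard error
head(cbind(cook_manual, cook_unscaled))
```

If neither version of Cook's distance matches, comparing both dfbeta(fit)
and dfbetas(fit) with the SPSS column is a cheap next check, since the two
differ only in scaling.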

Uwe Ligges



> dfbetas(fit)
>
> When I compare the returned values with the values that I get in SPSS,
> they differ, although the same model is fitted (the coefficients are
> the same, etc.).
>
> It seems that SPSS uses different formulas for cooks.distance and
> dfbetas than R does.
>
> Unfortunately I could not find out what the difference in the
> calculation is, or how to get R to compute the same statistics that
> SPSS uses.
> Or is this an unknown SPSS bug?
>
> Greetings
> Jürgen


