[R] "glm" function question

Gregor Gorjanc gregor.gorjanc at bfro.uni-lj.si
Sun Oct 22 03:02:50 CEST 2006


Chris Linton <connect.chris <at> gmail.com> writes:

> 
> I am creating a model attempting to predict the probability someone will
> reoffend after being caught for a crime.  There are seven total inputs and I
> planned on using a logistic regression.  I started with a null deviance of
> 182.91 and ended up with a residual deviance of 83.40 after accounting for
> different interactions and such.  However, I realized after that my code is
> different from that in my book.  And I can't figure out what I need to put
> in its place.  Here's my code:
> 
...
> fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior +
> factor(violence):factor(subst) + factor(violence):factor(educ) +
> factor(violence):factor(age) + factor(violence):factor(prior))
> 
> summary(fit1h)
> 
> If you noticed, there's no part of my code that looks like:
> 
> family=binomial(link="logit"))
> 
...
> 
> However, when I do this, my null deviance is 1104 and my residual deviance
> is 23460.  THIS IS A HUGE DIFFERENCE IN MODEL FIT!  I'm not sure if I have
> to redo my model or if my book was simply doing the
> "family=binomial(link="logit")" for a specific problem/reason.

You state that you model the *probability* that ... In that case
family=gaussian, which is the default family in glm(), is not appropriate. Yes,
you need to use family=binomial(link="logit") or family=binomial(link="probit"),
but you also need to take care to specify your y properly in the glm call.
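
A minimal sketch of what I mean, on simulated data only. I do not have your
data, so the values below are made up, the variable names just mirror your
post, and the object names (d, fit_gauss, fit_logit) are my own. It assumes
reoff is coded 0/1.

```r
# Simulated data only: names mirror the post, values are invented.
set.seed(1)
n <- 200
d <- data.frame(
  subst    = factor(sample(c("no", "yes"), n, replace = TRUE)),
  violence = factor(sample(c("no", "yes"), n, replace = TRUE)),
  prior    = rpois(n, 2)
)

# Assumed "true" model on the logit scale, used only to generate a 0/1 response.
eta <- -1 + 0.8 * (d$subst == "yes") + 0.4 * d$prior
d$reoff <- rbinom(n, size = 1, prob = plogis(eta))

# No family argument: glm() falls back to gaussian, i.e. an ordinary
# linear model fitted to the 0/1 response.
fit_gauss <- glm(reoff ~ subst + violence + prior, data = d)

# Logistic regression. For binary data y should be 0/1, logical, or a
# two-level factor; for grouped data use cbind(successes, failures).
fit_logit <- glm(reoff ~ subst + violence + prior, data = d,
                 family = binomial(link = "logit"))

family(fit_gauss)$family   # "gaussian"
family(fit_logit)$family   # "binomial"

# Fitted values of the logistic fit are probabilities.
range(fitted(fit_logit))
```

Note that deviances from the two fits are not on the same scale: with
family=gaussian the deviance is the residual sum of squares, with
family=binomial it is based on the binomial log-likelihood. So compare null and
residual deviance within one family, not across families. It is also worth
checking how y is coded, e.g. with table(d$reoff), before fitting.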

Gregor


