[R] Saturated model in binomial glm

Giovanni Petris gpetris at uark.edu
Wed Feb 16 22:58:10 CET 2011


Hi all,

Could somebody be so kind to explain to me what is the saturated model
on which deviance and degrees of freedom are calculated when fitting a
binomial glm?

Everything makes sense if I fit the model using as response a vector of
proportions or a two-column matrix. But when the response is a factor
and counts are specified via the "weights" argument, I am kind of lost
as far as what is the implied saturated model. 

Here is a simple example, based on the UCBAdmissions data.

> UCBAd <- as.data.frame(UCBAdmissions)
> UCBAd <- glm(Admit ~ Gender + Dept, family = binomial,
+                         weights = Freq, data = UCBAd)
> UCBAd$deviance
[1] 5187.488
> UCBAd$df.residual
[1] 17

I can see that the data frame UCBAd has 24 rows and using 1+1+5
parameters to fit the model leaves me with 17 degrees of freedom. 

What is not clear to me is what is the saturated model? 

Is it the model that fits a probability zero to each row corresponding
to failures and a probability one to each row corresponding to
successes? If this is so, it seems to me that looking at the deviance as
a goodness-of-fit statistic does not make much sense in this case. Am I
missing something?

Thank you in advance,
Giovanni


-- 

Giovanni Petris  <GPetris at uark.edu>
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



More information about the R-help mailing list