[R] glm with binomial errors - problem with overdispersion

Tue Jun 14 08:13:35 CEST 2011

I presume you intended 'type' and 'fragment' to be factors (see 
below).  With both as factors, the fragment*type model you fitted has 
8 parameters for 8 observations, so it would fit exactly.  The 
additive model

> model <- glm(y ~ fragment+type, binomial)

is only modestly over-dispersed, and shows that 'fragment' has zero 
effect.  Not 'a negligible effect', but no effect.  So something 
really odd is going on: is this an exercise with artificial data?
If not, you need to explain the exact balance between the two 
'fragments' (each fragment has exactly 1/4 success overall: 102/408 
and 105/420), because with a balance that exact your assumption of 
independent binomial sampling cannot be true.
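
The points above can be checked with a short sketch, assuming the data 
exactly as posted in the original message and recoding 'fragment' and 
'type' as factors.  The object names 'saturated', 'additive' and 
'prop_by_fragment' are illustrative only and appear nowhere in the 
original post.

```r
# Data as posted in the original message
success  <- c(14, 43, 44, 1, 13, 28, 56, 8)
failure  <- c(88, 59, 58, 101, 92, 77, 49, 97)
fragment <- factor(c(1, 1, 1, 1, 2, 2, 2, 2))  # assumption: intended as a factor
type     <- factor(c(1, 2, 3, 4, 1, 2, 3, 4))  # assumption: intended as a factor
y <- cbind(success, failure)

# Interaction model with factors: 8 parameters for 8 observations,
# so it is saturated and reproduces the data exactly.
saturated <- glm(y ~ fragment * type, family = binomial)
df.residual(saturated)   # 0
deviance(saturated)      # zero up to numerical tolerance

# Additive model: inspect the 'fragment' coefficient and a rough
# overdispersion indicator (residual deviance per degree of freedom).
additive <- glm(y ~ fragment + type, family = binomial)
coef(additive)["fragment2"]                 # zero up to numerical tolerance
deviance(additive) / df.residual(additive)

# Pooled success proportion within each fragment
prop_by_fragment <- tapply(success, fragment, sum) /
  tapply(success + failure, fragment, sum)
prop_by_fragment         # 0.25 for both fragments
```

The fragment coefficient is zero here because every 'type' has the 
same number of trials within a fragment (102 and 105), and both 
fragments have a pooled success proportion of exactly 1/4.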

Using a quasibinomial model does not change the deviance (see e.g. 
McCullagh and Nelder for the definitions, including that of 'scaled 
deviance'), but it does change the standard errors.
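
As a hedged illustration of that point, the sketch below refits both 
models on the posted data, with 'fragment' and 'type' left numeric as 
in the original code so that the posted output is reproduced.  The 
names 'm_bin', 'm_quasi', 'phi' and 'pearson_phi' are illustrative 
only.

```r
# Data as posted, with fragment and type numeric as in the original code
success  <- c(14, 43, 44, 1, 13, 28, 56, 8)
failure  <- c(88, 59, 58, 101, 92, 77, 49, 97)
fragment <- c(1, 1, 1, 1, 2, 2, 2, 2)
type     <- c(1, 2, 3, 4, 1, 2, 3, 4)
y <- cbind(success, failure)

m_bin   <- glm(y ~ fragment * type, family = binomial)
m_quasi <- glm(y ~ fragment * type, family = quasibinomial)

# Same estimates and same (unscaled) residual deviance in both fits
all.equal(coef(m_bin), coef(m_quasi))
all.equal(deviance(m_bin), deviance(m_quasi))

# The quasibinomial dispersion estimate is the Pearson chi-squared
# statistic divided by the residual degrees of freedom.
phi <- summary(m_quasi)$dispersion
pearson_phi <- sum(residuals(m_bin, type = "pearson")^2) / df.residual(m_bin)

# Standard errors are inflated by sqrt(phi); nothing else changes
se_bin   <- coef(summary(m_bin))[, "Std. Error"]
se_quasi <- coef(summary(m_quasi))[, "Std. Error"]
se_quasi / se_bin
sqrt(phi)
```

The 'scaled deviance' is the residual deviance divided by the 
dispersion, i.e. deviance(m_quasi) / phi in this sketch.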

On Mon, 13 Jun 2011, Anna Mill wrote:

> Dear all,
>
> I am new to R and my question may be trivial to you...
> I am doing a GLM with binomial errors to compare proportions of species in
> different categories of seed sizes (4 categories) between 2 sites.

You have types and fragments but no species and no sites.  At least 
'sites' should be a factor, as should 'categories of seed sizes'.

> In the model summary the residual deviance is much higher than the degree
> of freedom (Residual deviance: 153.74  on 4  degrees of freedom) and even
> after correcting for overdispersion by using a quasibinomial error structure
> instead of binomial the residual deviance does not change. Is this a data
> problem and I cannot use this statistic or is it because I do something
> wrong with R (see models attached)?
>
> Thanks a lot for your help!
> Anna
>
>
> first model with binomial error structure:
>
>> success<-c(14,43,44,1,13,28,56,8)
>> failure<-c(88,59,58,101,92,77,49,97)
>> "fragment"<-c(1,1,1,1,2,2,2,2)
>> "type"<-c(1,2,3,4,1,2,3,4)
>> y<-cbind(success,failure)
>> model<-glm(y~fragment*type,binomial)
>> summary(model)
> Call:
> glm(formula = y ~ fragment * type, family = binomial)
>
> Deviance Residuals:
>      1        2        3        4        5        6        7        8
> -4.0175   3.3716   4.5052  -6.0071  -2.8063   0.5449   6.0414  -5.0184
>
> Coefficients:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)    0.04433    0.61072   0.073   0.9421
> fragment      -0.65477    0.39001  -1.679   0.0932 .
> type          -0.46664    0.23027  -2.027   0.0427 *
> fragment:type  0.26636    0.14455   1.843   0.0654 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>    Null deviance: 157.96  on 7  degrees of freedom
> Residual deviance: 153.74  on 4  degrees of freedom
> AIC: 196.31
>
> Number of Fisher Scoring iterations: 5
>
> second model with quasibinomial error structure:
>> model2<-glm(y~fragment*type,quasibinomial)
>> summary(model2)
>
> Call:
> glm(formula = y ~ fragment * type, family = quasibinomial)
>
> Deviance Residuals:
>      1        2        3        4        5        6        7        8
> -4.0175   3.3716   4.5052  -6.0071  -2.8063   0.5449   6.0414  -5.0184
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)    0.04433    3.63550   0.012    0.991
> fragment      -0.65477    2.32169  -0.282    0.792
> type          -0.46664    1.37073  -0.340    0.751
> fragment:type  0.26636    0.86048   0.310    0.772
>
> (Dispersion parameter for quasibinomial family taken to be 35.43628)
>
>    Null deviance: 157.96  on 7  degrees of freedom
> Residual deviance: 153.74  on 4  degrees of freedom
> AIC: NA
>
> Number of Fisher Scoring iterations: 5
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595