[R] GLM and Neg. Binomial models

Thu Oct 13 21:34:16 CEST 2011

D_Tomas <tomasmeca <at> hotmail.com> writes:

> 
> Hi userRs!
> 
> I am trying to fit some Poisson and negative binomial GLMs. The negative
> binomial model is to account for over-dispersion.
> 
> When I fit the Poisson model I get:
> (Dispersion parameter for poisson family taken to be 1)
> 
> However, if I estimate the dispersion coefficient by means of: 
> sum(residuals(fit,type="pearson")^2)/fit$df.res
> I obtain 2.4. In theory this means over-dispersion, since 2.4 >> 1.
> 
> I do not understand what the relation is between (Dispersion parameter for
> poisson family taken to be 1) and 2.4.

   This means that the fit that glm() does assumes a scale parameter of
1: that is, it assumes the data are Poisson and does not try to estimate
a scale parameter.  For example, try running example(glm) [to generate
the glm.D93 object, which is the result of a glm() Poisson fit] and
then: summary(update(glm.D93,family=quasipoisson)) -- which will show
you that the dispersion parameter is estimated as 1.2933.  I would
guess that if you use a quasipoisson model you will get an estimated
scale parameter of 2.4 or very close to it: as far as I can tell,
summary.glm() estimates the dispersion as the Pearson chi-squared
statistic divided by the residual degrees of freedom, which is the
same calculation you did by hand.
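
   A minimal sketch of that comparison, with the data from the
example(glm) help page typed in by hand so that it runs on its own
(the disp_* names are only for illustration):

```r
## Data from the example(glm) help page (Dobson 1990, randomized trial)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

## Poisson fit: summary() reports the dispersion as fixed at 1
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
disp_poisson <- summary(glm.D93)$dispersion

## Hand calculation: Pearson chi-squared / residual degrees of freedom
disp_pearson <- sum(residuals(glm.D93, type = "pearson")^2) /
    df.residual(glm.D93)

## Quasi-Poisson fit: same coefficients, dispersion now estimated
glm.D93q <- update(glm.D93, family = quasipoisson())
disp_quasi <- summary(glm.D93q)$dispersion

print(c(poisson = disp_poisson, pearson = disp_pearson, quasi = disp_quasi))
```

The point estimates of the coefficients should be identical in the two
fits; only the standard errors change, since the quasi-Poisson summary
scales them by the square root of the estimated dispersion.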

> 
> In a similar fashion, when I fit the negative binomial model I obtain:
> (Dispersion parameter for Negative Binomial(0.1717) family taken to be 1)
> Whereas the estimation of the dispersion coefficient as stated above is: 1.4

  Do you mean 2.4?

> 
> Why are the dispersion parameter and my calculation not the same?
> 
> Any thoughts will be much appreciated.
> 

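  This one is a little harder to explain, but here goes: the negative
binomial distribution is not technically in the exponential family *unless*
its overdispersion parameter theta is set to a constant (=0.1717 in this
case).  The way glm.nb (which I assume you used) works is that it wraps
calls to glm() in an outer loop which attempts to estimate theta.
However, theta does not enter the equations in the same way as a regular
scale parameter would in a standard GLM (e.g. if family were gaussian or
Gamma): it changes the variance function itself, to mu + mu^2/theta,
rather than multiplying the variance by a constant.  Once theta is
fixed, the GLM scale parameter is taken to be 1, which is what the
summary reports; your Pearson-based value of 1.4 is a separate check on
how well the fitted variance matches the data, not an estimate of theta.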
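
  A rough sketch of the distinction, assuming the MASS package is
available and using its quine data set with an arbitrary main-effects
formula (not your data or model, just an illustration):

```r
library(MASS)  # glm.nb(), negative.binomial(), and the quine data

## glm.nb() alternates between glm() fits and updates of theta
fit_nb <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)
theta_hat <- fit_nb$theta

## The summary treats the GLM scale parameter as fixed at 1
disp_nb <- summary(fit_nb)$dispersion

## Pearson chi-squared / residual df: a goodness-of-fit check, not theta
disp_pearson <- sum(residuals(fit_nb, type = "pearson")^2) /
    df.residual(fit_nb)

## With theta held constant the model is an ordinary GLM again
fit_fixed <- glm(Days ~ Eth + Sex + Age + Lrn, data = quine,
                 family = negative.binomial(theta = theta_hat))

print(c(theta = theta_hat, dispersion = disp_nb, pearson = disp_pearson))
print(all.equal(coef(fit_nb), coef(fit_fixed), tolerance = 1e-4))
```

The three printed numbers answer three different questions: theta is
the estimated negative binomial parameter, the dispersion of 1 is the
fixed GLM scale, and the Pearson ratio indicates whether any
over-dispersion remains after the negative binomial variance has been
accounted for.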

  Ben Bolker