[R] GLM outputs in condensed versus expanded table

Peter Dalgaard pdalgd at gmail.com
Wed Aug 25 20:57:25 CEST 2010


On 08/25/2010 05:17 PM, francogrex wrote:
> 
> Hi I'm having different outputs from GLM when using a "condensed" table
> V1	V2	V3	Present	Absent
> 0	0	0	3	12
> 0	0	1	0	0
> 0	1	0	0	0
> 0	1	1	1	0
> 1	0	0	7	20
> 1	0	1	0	0
> 1	1	0	3	0
> 1	1	1	6	0
> 
> 
> resp=cbind(Present, Absent)
> glm(resp~V1+V2+V3+I(V1*V2*V3),family=binomial)
>> Deviance Residuals: 
> [1]  0  0  0  0  0  0  0  0
>  etc and also coefficients...
> 
> And when using the same but "expanded" table
> 
> 	V1	V2	V3	condition (1 present 0 absent)
> Id1	1	0	0	1
> id2	1	1	1	1
> ... etc
> glm(condition~V1+V2+V3+I(V1*V2*V3),family=binomial)
>> Deviance Residuals: 
>         Min          1Q      Median          3Q         Max  
>   -0.7747317  -0.7747317  -0.6680472   0.0001315   1.7941226 
> and also coefficients are different from above.
> 
> What could I be doing wrong?
> 
> 

Not necessarily anything. Anything technical, that is.

You have 3 uninformative combinations where the total is zero, which
leaves 5 informative groups. The model also has 5 parameters, so it is
quite likely to fit the aggregated data perfectly. Because some groups
have zeros in the "Absent" category, the fit probably diverged, so some
coefficients are numerically large.
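To make the counting argument concrete, here is a minimal sketch that
retypes the condensed table from the quoted message and refits the
model. The names d, fit_agg and informative are my own, and the
warnings are suppressed only to keep the output short; glm() is
expected to complain about fitted probabilities of 0 or 1.

```r
# The poster's condensed table, retyped from the quoted message
d <- data.frame(
  V1 = c(0, 0, 0, 0, 1, 1, 1, 1),
  V2 = c(0, 0, 1, 1, 0, 0, 1, 1),
  V3 = c(0, 1, 0, 1, 0, 1, 0, 1),
  Present = c(3, 0, 0, 1, 7, 0, 3, 6),
  Absent  = c(12, 0, 0, 0, 20, 0, 0, 0)
)

# Same model as in the question, fitted to the aggregated counts
fit_agg <- suppressWarnings(
  glm(cbind(Present, Absent) ~ V1 + V2 + V3 + I(V1 * V2 * V3),
      family = binomial, data = d)
)

# Rows with a nonzero total are the only ones that carry information
informative <- with(d, Present + Absent > 0)
sum(informative)                          # number of informative groups
length(coef(fit_agg))                     # number of parameters
coef(fit_agg)                             # look for very large estimates
round(residuals(fit_agg, "deviance"), 4)  # essentially all zero
```

If the two counts agree and the deviance residuals print as zeros, the
aggregated fit is saturated, and the large coefficients belong to the
groups with nothing in the "Absent" column.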

Refitting with individual data will likely give slightly different
coefficients, since the estimates depend on how far the iterations got
on the way to infinity before they stopped.

With the aggregated data, a perfect fit gives residuals of zero, but
with individual data, the 0's and 1's give negative and positive
residuals. Try

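z <- rep(0:1,5)    # individual data: ten 0/1 outcomes, five of each
zz <- cbind(5,5)   # the same data aggregated: 5 successes, 5 failures

# same fitted intercept, but compare the deviance residuals
summary(glm(z~1, binomial))
summary(glm(zz~1, binomial))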
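
The same comparison can be sketched on the poster's own numbers by
expanding the condensed table to one row per subject. This is only an
illustration under my own assumptions: the names d, ind, fit_agg and
fit_ind are made up here, the zero-total rows are left out since they
contribute nothing, and the warnings about fitted probabilities of 0 or
1 are suppressed.

```r
# The five informative rows of the poster's condensed table
d <- data.frame(
  V1 = c(0, 0, 1, 1, 1),
  V2 = c(0, 1, 0, 1, 1),
  V3 = c(0, 1, 0, 0, 1),
  Present = c(3, 1, 7, 3, 6),
  Absent  = c(12, 0, 20, 0, 0)
)

# One row per subject: repeat each covariate pattern Present times with
# condition 1 and Absent times with condition 0
ind <- rbind(
  cbind(d[rep(seq_len(nrow(d)), d$Present), 1:3], condition = 1),
  cbind(d[rep(seq_len(nrow(d)), d$Absent),  1:3], condition = 0)
)

fit_agg <- suppressWarnings(
  glm(cbind(Present, Absent) ~ V1 + V2 + V3 + I(V1 * V2 * V3),
      family = binomial, data = d)
)
fit_ind <- suppressWarnings(
  glm(condition ~ V1 + V2 + V3 + I(V1 * V2 * V3),
      family = binomial, data = ind)
)

# Finite coefficients should agree; the diverging ones may differ
cbind(aggregated = coef(fit_agg), individual = coef(fit_ind))

# Aggregated residuals are essentially zero, individual ones are not
range(residuals(fit_agg, "deviance"))
range(residuals(fit_ind, "deviance"))
```

The extremes of the individual residuals should correspond to the Min
and Max in the summary quoted above, since they come from the two
groups that have both present and absent cases.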

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


