[R] Interpretation of output from glm

Pedro de Barros pbarros at ualg.pt
Tue Nov 8 15:47:16 CET 2005


I am fitting a logistic model to binary data. The response variable is a 
factor (0 or 1) and all predictors are continuous variables. The main 
predictor is LT (I expect a logistic relation between LT and the 
probability of being mature), and the others are variables I expect to 
modify this relation.

I want to test whether all predictors contribute significantly to the fit.
I fit the full model and get these results:

 > summary(HMMaturation.glmfit.Full)

Call:
glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
     family = binomial(link = "logit"), data = HMIndSamples)

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-3.0983  -0.7620   0.2540   0.7202   2.0292

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
LT           5.372e-02  1.798e-02   2.987  0.00281 **
CondF       -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
Biom        -1.375e-02  2.005e-03  -6.856 7.07e-12 ***
LT:CondF     2.434e-03  3.813e-04   6.383 1.74e-10 ***
LT:Biom      7.833e-04  9.614e-05   8.148 3.71e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

     Null deviance: 10272.4  on 8224  degrees of freedom
Residual deviance:  7185.8  on 8219  degrees of freedom
AIC: 7197.8

Number of Fisher Scoring iterations: 8

However, when I run anova on the fit, I get:
 > anova(HMMaturation.glmfit.Full, test='Chisq')
Analysis of Deviance Table

Model: binomial, link: logit

Response: Mature

Terms added sequentially (first to last)


            Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL                        8224    10272.4
LT          1   2873.8      8223     7398.7       0.0
CondF       1      0.1      8222     7398.5       0.7
Biom        1      0.2      8221     7398.3       0.7
LT:CondF    1    142.1      8220     7256.3 9.413e-33
LT:Biom     1     70.4      8219     7185.8 4.763e-17
Warning message:
fitted probabilities numerically 0 or 1 occurred in: method(x = x[, varseq 
<= i, drop = FALSE], y = object$y, weights = object$prior.weights,


I am having a little difficulty interpreting these results.
The summary of the fit tells me that all predictors are significant, while 
the anova indicates that, besides LT (the main variable), only the 
interactions of the other variables with LT are significant, not their 
main effects.
I believe that in the first output (the summary of the glm object), the 
significance of each term is calculated considering it alone in the model 
(i.e. removing all other terms), while the anova output is (as it says) 
considering the sequential addition of the terms.
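
One way to check this reading is to compare the two kinds of test on data 
where the truth is known. The sketch below uses simulated data (the data 
frame `sim`, its coefficients and the seed are assumptions made purely for 
illustration, not the HMIndSamples data). For reference, the z tests printed 
by summary() are Wald tests of each coefficient with all other terms kept in 
the model, whereas anova() tests each term given only the terms listed 
before it, so only the anova table should change when the terms are 
reordered.

```r
# Simulated stand-in data: names mirror the post, values are invented.
set.seed(1)
n <- 2000
sim <- data.frame(
  LT    = runif(n, 10, 40),
  CondF = rnorm(n, 100, 10),
  Biom  = rnorm(n, 200, 50)
)
# Assumed "true" model: LT effect plus an LT-by-CondF interaction.
eta <- -6 + 0.25 * sim$LT + 0.0005 * sim$LT * (sim$CondF - 100)
sim$Mature <- rbinom(n, 1, plogis(eta))

# Same model, main effects entered in two different orders.
fit_a <- glm(Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
             family = binomial(link = "logit"), data = sim)
fit_b <- glm(Mature ~ Biom + CondF + LT + LT:CondF + LT:Biom,
             family = binomial(link = "logit"), data = sim)

# Wald z tests: each coefficient tested with all other terms in the model.
wald_a <- coef(summary(fit_a))
wald_b <- coef(summary(fit_b))

# Sequential tests: each term tested given only the terms before it.
seq_a <- anova(fit_a, test = "Chisq")
seq_b <- anova(fit_b, test = "Chisq")

print(wald_a)
print(seq_a)
print(seq_b)   # deviance attributed to LT, CondF and Biom differs from seq_a
```

The fitted model is the same in both cases (same residual deviance), and the 
sequential deviances always add up to the drop from the null to the residual 
deviance; only how that drop is shared among the terms depends on the order.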

So, there are 2 questions:
a) Can I conclude that the interactions are significant, but the main 
effects are not?
b) Is it legitimate to consider a model that includes the interactions, 
but not the main effects CondF and Biom?
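
Question (b) can at least be examined directly by fitting both models and 
comparing them. The following is a minimal sketch on simulated data (the 
data frame `sim` and the object names `full` and `reduced` are hypothetical, 
chosen for illustration only). It compares the full model with one that 
keeps the interactions but drops the CondF and Biom main effects, using a 
likelihood-ratio test on 2 degrees of freedom and AIC. Note that with 
continuous predictors the reduced model forces the effects of CondF and Biom 
to be exactly zero at LT = 0, so whether it is sensible depends on the 
origin of LT.

```r
# Simulated stand-in data: names mirror the post, values are invented.
set.seed(2)
n <- 2000
sim <- data.frame(
  LT    = runif(n, 10, 40),
  CondF = rnorm(n, 100, 10),
  Biom  = rnorm(n, 200, 50)
)
eta <- -6 + 0.25 * sim$LT + 0.0005 * sim$LT * (sim$CondF - 100)
sim$Mature <- rbinom(n, 1, plogis(eta))

# Full model, as in the post.
full <- glm(Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
            family = binomial(link = "logit"), data = sim)

# Reduced model: interactions kept, CondF and Biom main effects dropped.
reduced <- glm(Mature ~ LT + LT:CondF + LT:Biom,
               family = binomial(link = "logit"), data = sim)

# Joint likelihood-ratio test of the two dropped main effects (2 df).
lrt <- anova(reduced, full, test = "Chisq")
print(lrt)

# AIC comparison of the same two models.
print(AIC(reduced, full))

# drop1() respects marginality: it only offers the interaction terms
# for removal while the corresponding main effects are in the model.
print(drop1(full, test = "Chisq"))
```

A large p-value in the `lrt` table would suggest that the two main effects 
add little once the interactions are present; it does not by itself say 
that the reduced model is the more interpretable one.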

Thanks for any help,

Pedro



