[R] [FORGED] deviance in GLM vs. summary.glm

Wed May 31 11:31:02 CEST 2017

On 31/05/17 16:53, array chip via R-help wrote:
> Hi, I am running a logistic regression on a simple dataset (attached) using glm:
>> dat<-read.table("dat.txt",sep='\t',header=T)
> If I use summary() on a logistic model:
>> summary(glm(y~x1*x2,dat,family='binomial'))
> Coefficients:            Estimate Std. Error z value Pr(>|z|)(Intercept)    19.57    5377.01   0.004    0.997x1            -18.59    5377.01  -0.003    0.997x2B           -19.57    5377.01  -0.004    0.997x1:x2B         38.15    7604.24   0.005    0.996
> As you can see, the interaction term is very insignificant (p = 0.996)!
> But if I use a anova() to compare a full vs reduced model to evaluate the interaction term:
>> anova(glm(y~x1+x2,dat,family='binomial'), glm(y~x1*x2,dat,family='binomial'))Analysis of Deviance Table
> Model 1: y ~ x1 + x2Model 2: y ~ x1 * x2  Resid. Df Resid. Dev Df Deviance1        22     27.067            2        21     21.209  1   5.8579
> This follows a chi-square distribution with 1 df, so the corresponding p value is:
>> 1-pchisq(5.8679,1)[1] 0.01541944
> So I get very different p value on the interaction term, can someone share what's going wrong here?

(1) To re-emphasize Berwin Turlach's exhortation, ***PLEASE*** do not 
post in html; your results are nearly impossible to read.

(2) To add a tiny bit to Berwin's comments:

There is something a bit weird about your data; note that if you look at 
the fitted values from your fit, you get only *three* distinct values 
--- whereas there should be four, since there are four distinct
treatment combinations, (0,A), (0,B), (1,A) and (1,B).  And all possible 
treatment combinations do indeed occur.

The deficiency occurs because the last three of your coefficients (the 
non-intercept terms) sum to (effectively) 0.  Thus you get the same 
fitted value from the (1,B) combination as from the (0,A) combination.

It is not clear to me what the implications of all this are --- 
particularly since it's getting on for my bed time :-) --- but it seems 
clear that this sort of "degeneracy" is going to mess things up.
And perhaps contribute to the Hauck-Donner effect that Berwin told you 
about.

I would also advise you to try experimenting with simulated data.  (When 
in doubt, simulate.  Even when not in doubt, simulate.  That is my 
guiding principle.)  I.e. try to simulate y-values from a coefficient 
vector that does not suffer from the "degeneracy", fit a model to the 
simulated y-values, and compare the two tests.  They still won't agree, 
but they will (surely?) be less wildly discrepant.

cheers,

Rolf Turner

-- 
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276