[R] Chi-square values in GLM model comparison

Marc Schwartz marc_schwartz at me.com
Wed Sep 11 19:50:10 CEST 2013


Torvon,

There is some confusion in your postings. In your first post the models were GLMs, but fitted with the default gaussian family (not binomial), since the 'family' argument was not present in the glm() call. In your second post you refer to clm(), which fits cumulative link models for ordinal responses and comes from the 'ordinal' CRAN package.

If you want binomial logistic regression models, you need to use:

  m1 <- glm(sym_bin ~ phq_index, data = data2, family = binomial)
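
A quick way to confirm which family a fitted model actually used is to inspect its family object. This is only a minimal sketch: it uses the built-in infert data rather than your data2, and the object names m_default and m_binom are purely illustrative.

```r
# glm() without a 'family' argument falls back to the gaussian family
m_default <- glm(case ~ education, data = infert)

# Supplying family = binomial gives a logistic regression
m_binom <- glm(case ~ education, data = infert, family = binomial)

# family() returns the family object stored in each fit
print(family(m_default)$family)
print(family(m_binom)$family)
```

The first call should report "gaussian" and the second "binomial".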


As an example, using the ?infert dataset with a single IV:

MOD <- glm(case ~ education, data = infert, family = binomial)

> summary(MOD)

Call:
glm(formula = case ~ education, family = binomial, data = infert)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.9053  -0.9053  -0.9005   1.4765   1.4823  

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)      -6.931e-01  6.124e-01  -1.132    0.258
education6-11yrs  4.477e-15  6.423e-01   0.000    1.000
education12+ yrs  1.290e-02  6.431e-01   0.020    0.984

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 316.17  on 247  degrees of freedom
Residual deviance: 316.17  on 245  degrees of freedom
AIC: 322.17

Number of Fisher Scoring iterations: 4


> anova(MOD, test = "Chisq")
Analysis of Deviance Table

Model: binomial, link: logit

Response: case

Terms added sequentially (first to last)


          Df  Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL                         247     316.17         
education  2 0.0022894       245     316.17   0.9989
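
If you would rather compare two explicitly fitted nested models, as in your anova(m1, m2) call, something along the following lines should give the same test. It is a sketch only: it again uses infert in place of your data, and fit0, fit1, cmp, chisq, df_diff, p_value and lr_stat are illustrative names. The chi-square value is the difference in residual deviance between the two models, which for these models equals twice the difference in log-likelihoods.

```r
# Null (intercept-only) and full binomial logistic models on the built-in infert data
fit0 <- glm(case ~ 1, data = infert, family = binomial)
fit1 <- glm(case ~ education, data = infert, family = binomial)

# Chi-square difference (likelihood ratio) test between the nested models
cmp <- anova(fit0, fit1, test = "Chisq")
print(cmp)

# The same statistic computed by hand from the residual deviances
chisq <- deviance(fit0) - deviance(fit1)
df_diff <- df.residual(fit0) - df.residual(fit1)
p_value <- pchisq(chisq, df = df_diff, lower.tail = FALSE)

# Equivalent form: twice the difference in log-likelihoods
lr_stat <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))
```

The 'Deviance' column of cmp, chisq and lr_stat should all agree with the 0.0022894 on 2 degrees of freedom in the sequential table above.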


Regards,

Marc Schwartz


On Sep 11, 2013, at 11:17 AM, Torvon <torvon at gmail.com> wrote:

> José,
> 
> I get the following error message:
> 
>> m1<-clm(sym_bin ~ phq_index, data=data2)
>> m2<-clm(sym_bin ~ 1, data=data2)
>> anova(m1,m2,test="Chisq")
> 
>> Error in anova.clm(m1, m2, test = "Chisq") :
>> only 'clm' and 'clmm' objects are allowed
> 
> My dependent variable is binary, so I don't know what the problem could be.
> See below the model summaries. Thank you! Eiko
> 
>> summary(m1)
> formula: sym_bin ~ phq_index
> data:    data2
> 
> link  threshold nobs  logLik   AIC     niter max.grad cond.H
> logit flexible  12348 -4846.49 9710.97 7(0)  2.53e-08 1.4e+02
> 
> Coefficients:
>           Estimate Std. Error z value Pr(>|z|)
> phq_index2 -0.29705    0.11954  -2.485    0.013 *
> phq_index3  0.63382    0.10262   6.176 6.56e-10 ***
> phq_index4  1.53022    0.09664  15.834  < 2e-16 ***
> phq_index5  0.90720    0.09996   9.075  < 2e-16 ***
> phq_index6 -0.03855    0.11337  -0.340    0.734
> phq_index7 -0.06488    0.11394  -0.569    0.569
> phq_index8 -1.15618    0.15156  -7.628 2.38e-14 ***
> phq_index9 -2.50064    0.25670  -9.741  < 2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Threshold coefficients:
>    Estimate Std. Error z value
> 0|1  1.87770    0.07959   23.59
> 
> 
>> summary(m2)
> formula: sym_bin ~ 1
> data:    data2
> 
> link  threshold nobs  logLik   AIC      niter max.grad
> logit flexible  12348 -5472.48 10946.96 5(0)  1.01e-11
> 
> Threshold coefficients:
>  0|1
> 1.642
> 
> 
> On 11 September 2013 18:03, Jose Iparraguirre <
> Jose.Iparraguirre at ageuk.org.uk> wrote:
> 
>> Hi Eiko,
>> 
>> How about this?
>> 
>>> anova (m1, m2, test="Chisq")
>> 
>> See: ?anova.glm
>> 
>> Regards,
>> José
>> 
>> 
>> Prof. José Iparraguirre
>> Chief Economist
>> Age UK
>> 
>> 
>> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Torvon
>> Sent: 11 September 2013 16:48
>> To: r-help at r-project.org
>> Subject: [R] Chi-square values in GLM model comparison
>> 
>> Hello --
>> I am comparing two GLMs (binomial dependent variable), the results are the following:
>>> m1<-glm(symptoms ~ phq_index, data=data2)
>>> m2<-glm(symptoms ~ 1, data=data2)
>> 
>> Trying to compare these models using
>>> anova (m1, m2)
>> I do not obtain chi-square values or a chi-square difference test;
>> instead, I get loglikelihood ratios:
>> 
>>> Likelihood ratio tests of cumulative link models:
>>> formula: link: threshold:
>>> m2 sym_bin ~ 1         logit flexible
>>> m1 sym_bin ~ phq_index logit flexible
>>>      no.par   AIC   logLik  LR.stat df Pr(>Chisq)
>>> m2      1    10947   -5472.5
>>> m1      9     9711   -4846.5    1252  8  < 2.2e-16 ***
>> 
>> Since reviewers would like me to report chi-square values: how do I obtain
>> them when comparing GLMs? I'm looking for an output similar to the output
>> of the GLMER function in LME4, e.g.:
>> 
>>> anova(m3,m4)
>> ...
>>>      Df   AIC   BIC  logLik Chisq Chi Df Pr(>Chisq)
>>> m3 13 11288 11393 -5630.9
>>> m4 21 11212 11382 -5584.9 92.02      8  < 2.2e-16 ***
>> 
>> Thank you!
>> Eiko


