[R] Goodness of fit of binary logistic model

David Winsemius dwinsemius at comcast.net
Fri Aug 5 18:35:34 CEST 2011


On Aug 5, 2011, at 12:21 PM, Paul Smith wrote:

> On Fri, Aug 5, 2011 at 4:54 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>>> I have just estimated this model:
>>> -----------------------------------------------------------
>>> Logistic Regression Model
>>>
>>> lrm(formula = Y ~ X16, x = T, y = T)
>>>
>>>                    Model Likelihood     Discrimination    Rank Discrim.
>>>                       Ratio Test            Indexes          Indexes
>>>
>>> Obs            82    LR chi2      5.58    R2       0.088    C       0.607
>>> 0              46    d.f.            1    g        0.488    Dxy     0.215
>>> 1              36    Pr(> chi2) 0.0182    gr       1.629    gamma   0.589
>>> max |deriv| 9e-11                         gp       0.107    tau-a   0.107
>>>                                           Brier    0.231
>>>
>>>         Coef    S.E.   Wald Z Pr(>|Z|)
>>> Intercept -1.3218 0.5627 -2.35  0.0188
>>> X16=1      1.3535 0.6166  2.20  0.0282
>>> -----------------------------------------------------------
>>>
>>> Analyzing the goodness of fit:
>>>
>>> -----------------------------------------------------------
>>>>
>>>> resid(model.lrm,'gof')
>>>
>>> Sum of squared errors     Expected value|H0                    SD
>>>        1.890393e+01          1.890393e+01          6.073415e-16
>>>                   Z                     P
>>>       -8.638125e+04          0.000000e+00
>>> -----------------------------------------------------------
>>>
>>> From the above calculated p-value (0.000000e+00), one should discard
>>> this model. However, there is something that is puzzling me: if the
>>> 'Expected value|H0' coincides so closely with the 'Sum of squared
>>> errors', why should one discard the model? I am certainly missing
>>> something.
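For completeness, here is a minimal self-contained sketch of that kind of analysis, using simulated data. Nothing below is the original data; the sample size (n = 82), the variable names Y and X16, and the rough coefficient values are simply taken from the quoted output, and the rest is invented. The 'gof' residual type reports the le Cessie-van Houwelingen unweighted sum-of-squares global goodness-of-fit test, and the Z in that printout is (Sum of squared errors - Expected value|H0) divided by the SD.

-----------------------------------------------------------
## Minimal sketch on simulated data (assumed, not the poster's data)
library(rms)     # provides lrm() and residuals.lrm()

set.seed(1)
X16 <- factor(rbinom(82, 1, 0.5))                        # single binary predictor
Y   <- rbinom(82, 1, plogis(-1.3 + 1.35 * (X16 == "1"))) # outcome with roughly the quoted coefficients

fit <- lrm(Y ~ X16, x = TRUE, y = TRUE)   # x = TRUE, y = TRUE are needed for the gof test
fit                                       # printed summary in the format quoted above

## Global goodness-of-fit test:
## Z = (Sum of squared errors - Expected value|H0) / SD
resid(fit, "gof")
-----------------------------------------------------------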
>>
>> It's hard to tell what you are missing, since you have not described
>> your reasoning at all. So I guess what is in error is your expectation
>> that we would have drawn all of the unstated inferences that you draw
>> when offered the output from lrm. (I certainly did not draw the
>> inference that "one should discard the model".)
>>
>> resid is a function designed for use with glm and lm models. Why
>> aren't you using residuals.lrm?
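As a side note (a quick check, assuming the simulated fit from the sketch above): resid() is the standard S3 generic, and for an object of class "lrm" it dispatches to residuals.lrm, so the two calls should return the same thing, which is consistent with the identical output just below.

-----------------------------------------------------------
## Both calls go through the same method for an lrm fit
identical(resid(fit, "gof"), residuals.lrm(fit, "gof"))   # should be TRUE
-----------------------------------------------------------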
>
> ----------------------------------------------------------
>> residuals.lrm(model.lrm,'gof')
> Sum of squared errors     Expected value|H0                    SD
>         1.890393e+01          1.890393e+01          6.073415e-16
>                    Z                     P
>        -8.638125e+04          0.000000e+00

Great. Now please answer the more fundamental question: why do you
think this means "discard the model"?

David Winsemius, MD
West Hartford, CT


