[R] Goodness of fit of binary logistic model
    peter dalgaard 
    pdalgd at gmail.com
       
    Sat Aug  6 00:47:59 CEST 2011
    
    
  
On Aug 5, 2011, at 23:16 , Paul Smith wrote:
> Thanks, Frank. The following piece of code generates data that
> exhibits the problem I reported:
> 
> -----------------------------------------
> set.seed(123)
> intercept = -1.32
> beta = 1.36
> xtest = rbinom(1000,1,0.5)
> linpred = intercept + xtest*beta
> prob = exp(linpred)/(1 + exp(linpred))
> runis = runif(1000,0,1)
> ytest = ifelse(runis < prob,1,0)
> xtest <- as.factor(xtest)
> ytest <- as.factor(ytest)
> library(rms)
> model <- lrm(ytest ~ xtest, x = TRUE, y = TRUE)
> model
> residuals(model, type = "gof")
> -----------------------------------------
Basically, what you have is zero divided by zero, except that floating point inaccuracy turns it into the ratio of two small numbers. So the Z statistic is effectively rubbish.
This comes about because the SSE minus its expectation has effectively zero variance, which makes it rather useless for testing whether the model fits.
Since the model is basically a full model for a 2x2 table, it is not surprising to me that "goodness of fit" tests behave poorly. In fact, I would conjecture that no sensible g.o.f. test exists for that case.
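To make the zero-over-zero point concrete, here is a minimal sketch. It reuses the simulated data from the quoted code but fits the model with base R glm() rather than lrm(); that substitution, and the variable names cell_means, fitted_by_cell, sse and expected_sse, are assumptions made for illustration and are not what residuals.lrm does internally. It only shows that, for a binary predictor, the fitted probabilities reproduce the cell proportions, so the unweighted sum of squared errors coincides with its model-based expectation.

```r
# Sketch: same simulated data as in the quoted message, fitted with
# base R glm() instead of rms::lrm() (an assumption for illustration).
set.seed(123)
intercept <- -1.32
beta <- 1.36
xtest <- rbinom(1000, 1, 0.5)
prob <- plogis(intercept + xtest * beta)
ytest <- as.numeric(runif(1000) < prob)

fit <- glm(ytest ~ factor(xtest), family = binomial)
phat <- fitted(fit)

# The model is saturated for the 2x2 table: each fitted probability
# equals the observed proportion of ones at that level of x.
cell_means <- tapply(ytest, xtest, mean)
fitted_by_cell <- tapply(phat, xtest, mean)
print(rbind(cell_means, fitted_by_cell))

# Unweighted sum of squared errors and its model-based expectation.
sse <- sum((ytest - phat)^2)
expected_sse <- sum(phat * (1 - phat))

# Essentially zero: only numerical error remains.
print(sse - expected_sse)
```

Within a group of n observations with fitted probability p equal to the observed proportion, sum((y - p)^2) is exactly n*p*(1 - p), so the difference vanishes for any data set, not just this one. A statistic that standardizes that difference therefore has nothing but numerical noise in both numerator and denominator.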
> 
> Paul
> 
> 
> On Fri, Aug 5, 2011 at 7:58 PM, Frank Harrell <f.harrell at vanderbilt.edu> wrote:
>> Please provide the data or better the R code for simulating the data that
>> shows the problem.  Then we can look further into this.
>> Frank
>> 
>> -----
>> Frank Harrell
>> Department of Biostatistics, Vanderbilt University
>> --
>> View this message in context: http://r.789695.n4.nabble.com/Goodness-of-fit-of-binary-logistic-model-tp3721242p3721997.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg
    
    