[R] Logistic regression - confidence intervals

Frank E Harrell Jr f.harrell at vanderbilt.edu
Wed Feb 8 18:20:26 CET 2006


Cox, Stephen wrote:
> Please forgive a rather naïve question... 
> 
> Could someone please give a quick explanation for the differences in confidence intervals obtained via confint.glm (based on profile likelihoods) and the intervals obtained using the Design library.
> 
> For example, the intervals in the following two outputs are different.
> 
> library(Design)
> x = rnorm(100)
> y = gl(2,50)
> d = data.frame(x = x, y = y)
> dd = datadist(d); options(datadist = 'dd')
> m1 = lrm(y~x, data = d)
> summary(m1)
> 
> m2 = glm(y~x, family = binomial, data = d)
> confint(m2)
> 
> I have spent time trying to figure this out via archives, but have not had much luck.
> 
> Regards
> 
> Stephen

Design uses Wald-based confidence intervals, which rely on large-sample 
normality of the parameter estimates.  These are good for most situations, 
but profile likelihood confidence intervals are preferred.  Someday I'll 
make Design do those.
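
As a rough sketch of the distinction, the two kinds of interval can be 
computed side by side on a glm fit using only base R.  The seed value and 
the object names (wald, wald_builtin, prof) are arbitrary choices for 
illustration and are not from the original post; this does not call Design.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(100)
y <- gl(2, 50)
m2 <- glm(y ~ x, family = binomial)

# Wald interval: estimate +/- normal quantile * standard error
est  <- coef(m2)
se   <- sqrt(diag(vcov(m2)))
wald <- cbind(lower = est - qnorm(0.975) * se,
              upper = est + qnorm(0.975) * se)

# confint.default() computes the same Wald interval
wald_builtin <- confint.default(m2)

# confint() on a glm profiles the likelihood instead
prof <- confint(m2)

wald
prof
```

Both intervals here are for a one-unit change in x, so any difference 
between wald and prof comes from the method alone.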

One advantage to Wald statistics is that they extend readily to cluster 
sampling (e.g., using cluster sandwich covariance estimators) and other 
complications (e.g., adjustment of variances for multiple imputation), 
whereas likelihood ratio statistics do not (unless e.g. you have an 
explicit model for the correlation structure or other facets of the model).

Also note that confint is probably giving a confidence interval for a 
one-unit change in x whereas summary.Design is computing an 
interquartile-range effect (difference in x-values is shown in the 
summary output).
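
To illustrate the scaling, here is a minimal sketch that converts a 
one-unit Wald interval into an interquartile-range effect by hand.  It 
assumes the effect is simply the coefficient multiplied by the difference 
between the 75th and 25th percentiles of x; the names delta, one_unit and 
iqr_eff are made up for this example, and the quantiles Design actually 
uses come from datadist, so the numbers need not match its output exactly.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(100)
y <- gl(2, 50)
m2 <- glm(y ~ x, family = binomial)

# Difference between the upper and lower quartiles of x
q     <- quantile(x, c(0.25, 0.75))
delta <- unname(diff(q))

# Wald interval for a one-unit change in x (log odds scale)
b        <- unname(coef(m2)["x"])
se       <- sqrt(vcov(m2)["x", "x"])
one_unit <- b + c(-1, 1) * qnorm(0.975) * se

# The same interval rescaled to an interquartile-range change in x
iqr_eff <- delta * one_unit

one_unit
iqr_eff
exp(iqr_eff)                     # the same limits on the odds ratio scale
```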

When posting a nice simulated example it's best to do 
set.seed(something) so everyone will get the same data.
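
For instance (the seed value 123 is arbitrary), resetting the seed before 
each draw reproduces the same random numbers:

```r
set.seed(123)                    # any fixed integer will do
x1 <- rnorm(100)

set.seed(123)                    # same seed again
x2 <- rnorm(100)

identical(x1, x2)                # the two draws are the same
```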

Frank

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



