[R] Pointwise Confidence Bounds on Logistic Regression

Bryan Hanson hanson at depauw.edu
Fri Jun 20 01:54:51 CEST 2008


Thanks so much to all who offered assistance.  I have to say it would have
taken me a long time to figure this out, so I am most grateful.  Plus,
studying your examples greatly improves my understanding.

As a follow up, the fit process gives the following error:

Warning messages:
1: In eval(expr, envir, enclos) :
  non-integer #successes in a binomial glm!

Is this something I should worry about?  It doesn't arise from glm or
glm.fit 

Thanks, Bryan

For the record, a final complete working code is appended below.

x <- c(1:40)
y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20))
y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15))
data <- as.data.frame(cbind(x,y1,y2))
plot(x, y1, pch = 1, ylim = c(0,1), col = "red", main = "Logistic Regression
w/Confidence Bounds", ylab = "y values",  xlab = "x values")
points(x, y2, pch = 3, col = "blue")
abline(h = 0.5, col = "gray")
fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action =
na.omit)
fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action =
na.omit)
lines(fit1$model$x, fit1$fitted.values, col = "red")
lines(fit2$model$x, fit2$fitted.values, col = "blue")

## predictions on scale of link function
pred1 <- predict(fit1, se.fit = TRUE)
pred2 <- predict(fit2, se.fit = TRUE)

## constant for 95% confidence bands
## getting two t values is redundant here as fit1 and fit2
## have same residual df, but the real world may be different
res.df <- c(fit1$df.residual, fit2$df.residual)
## 0.975 because we want 2.5% in upper and lower tail
const <- qt(0.975, df = res.df)

## confidence bands on scale of link function
upper1 <- pred1$fit + (const[1] * pred1$se.fit)
lower1 <- pred1$fit - (const[1] * pred1$se.fit)
upper2 <- pred2$fit + (const[2] * pred2$se.fit)
lower2 <- pred2$fit - (const[2] * pred2$se.fit)

## bind together into a data frame
bands <- data.frame(upper1, lower1, upper2, lower2)

## transform on to scale of response
bands <- data.frame(lapply(bands, binomial(link = "logit")$linkinv))

## plot confidence bands
lines(fit1$model$x, bands$upper1, col = "pink")
lines(fit1$model$x, bands$lower1, col = "pink")
lines(fit2$model$x, bands$upper2, col = "lightblue")
lines(fit2$model$x, bands$lower2, col = "lightblue")



On 6/19/08 12:28 PM, "Gavin Simpson" <gavin.simpson at ucl.ac.uk> wrote:

> On Thu, 2008-06-19 at 10:42 -0400, Bryan Hanson wrote:
>> [I've ommitted some of the conversation so far...]
>> 
>>> E.g. in a logistic model, with (say) eta = beta_0 + beta_1*x one may
>>> find, on the
>>> linear predictor scale, A and B (say) such that P(A <= eta <= B) = 0.95.
>>> 
>>> Then P(expit(A) <= expit(eta) <= expit(B)) = 0.95, which is exactly
>>> what is wanted.
>> 
>> I think I follow the above conceptually, but I don't know how to implement
>> it, though I fooled around (unsuccessfully) with some of the variations on
>> predict().
>> 
>> I'm trying to learn this in response to a biology colleague who did
>> something similar in SigmaPlot. I can already tell that SigmaPlot did a lot
>> of stuff for him in the background.  The responses are 0/1 of a particular
>> observation by date.  The following code simulates what's going on (note
>> that I didn't use 0/1 since this leads to a recognized condition/warning of
>> fitting 1's and 0's. I've requested Brian's Pattern Recognition book so I
>> know what the problem is and how to solve it).  My colleague is looking at
>> two populations in which the "LD50" would differ.  I'd like to be able to
>> put the "pointwise confidence bounds" on each curve so he can tell if the
>> populations are really different.
>> 
>> By the way, this code does give a (minor?) error from glm (which you will
>> see).
>> 
>> Can you make a suggestion about how to get those "confidence bounds" on
>> there?
>> 
>> Also, is a probit link more appropriate here?
>> 
>> Thanks,  Bryan
>> 
>> x <- c(1:40)
>> y1 <- c(rep(0.1,10), rep(NA, 10), rep(0.9,20))
>> y2 <- c(rep(0.1,15), rep(NA, 10), rep(0.9,15))
>> data <- as.data.frame(cbind(x,y1,y2))
>> plot(x, y1, pch = 1, ylim = c(0,1), col = "red")
>> points(x, y2, pch = 3, col = "blue")
>> abline(h = 0.5, col = "gray")
>> fit1 <- glm(y1~x, family = binomial(link = "logit"), data, na.action =
>> na.omit)
>> fit2 <- glm(y2~x, family = binomial(link = "logit"), data, na.action =
>> na.omit)
>> lines(fit1$model$x, fit1$fitted.values, col = "red")
>> lines(fit2$model$x, fit2$fitted.values, col = "blue")
> 
> The point is to get predictions on the scale of the link function,
> generate 95% confidence bands in the normal way and then transform the
> confidence bands onto the scale of the response using the inverse of the
> link function used to fit the model.
> 
> [note, am doing this from memory, so best to check this is right -- I'm
> sure someone will tell me very quickly if I have gone wrong anywhere!]
> 
> ## predictions on scale of link function
> pred1 <- predict(fit1, se.fit = TRUE)
> pred2 <- predict(fit2, se.fit = TRUE)
> 
> ## constant for 95% confidence bands
> ## getting two t values is redundant here as fit1 and fit2
> ## have same residual df, but the real world may be different
> res.df <- c(fit1$df.residual, fit2$df.residual)
> ## 0.975 because we want 2.5% in upper and lower tail
> const <- qt(0.975, df = res.df)
> 
> ## confidence bands on scale of link function
> upper1 <- pred1$fit + (const[1] * pred1$se.fit)
> lower1 <- pred1$fit - (const[1] * pred1$se.fit)
> upper2 <- pred2$fit + (const[2] * pred2$se.fit)
> lower2 <- pred2$fit - (const[2] * pred2$se.fit)
> 
> ## bind together into a data frame
> bands <- data.frame(upper1, lower1, upper2, lower2)
> 
> ## transform on to scale of response
> bands <- data.frame(lapply(bands, binomial(link = "logit")$linkinv))
> 
> ## plot confidence bands
> lines(fit1$model$x, bands$upper1, col = "red", lty = "dotted")
> lines(fit1$model$x, bands$lower1, col = "red", lty = "dotted")
> lines(fit2$model$x, bands$upper2, col = "blue", lty = "dotted")
> lines(fit2$model$x, bands$lower2, col = "blue", lty = "dotted")
> 
> HTH
> 
> G
> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list