[R] se.fit in predict.glm

Wed Apr 28 23:57:20 CEST 2004

(Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> writes:

> The results of
>   confint(g)
> gave me a and b for the lower (2.5%) and upper (97.5%) curve.
> When plotted, these curves lay well outside the "confidence
> bands" obtained from "predict.glm" and were much more realistically
> related to my simulations (19 of my simulated curves nicely packed
> the space between tthe two "confint" curves, and one lay just
> outside -- couldn't have hoped for a result more close to expectation!).

Hmm, that doesn't actually hold up mathematically... You cannot just
take upper/lower limits of both parameters and combine them.

> Nevertheless, I don't think my data were all that few or nasty:
> 
> 23 x-values, roughly equally spaced, with about 12 0/1 results
> at each, and numbers of responses going up as
>   0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 2 4 5 8 8 9 4 10
> 
> So I tend to conclude that the "predict.glm(...,se.fit=TRUE)"
> method should perhaps be avoided in favour of using "confint",
> though I see no indication that "confint" respects the covariance
> of the parameter estimates (intercept and slope) whereas the
> "predict" method in theory does.
> 
> Maybe I'll have another go, after centering the x-values at their
> mean ...

Shouldn't change anything (except maybe demonstrate the fallacy of
your calculation above -- lower b's give higher p's when x is negative).

> Anyway, comments would be appreciated!

I don't seem to get anything that drastic. Things look somewhat better
if you use the link-scale estimates, but the response-scale curves are
not hopeless. You do have to notice that these are pointwise CIs so
having multiple curves straying outside is not necessarily a problem.

Just as a sanity check, did your plots look anything like the below:

y <- c(0,0,0,0,0,0,0,0,0,0,2,0,0,1,0,2,4,5,8,8,9,4,10)
x <- 1:23
d <- data.frame(x,y,n=12)
m1 <- glm(cbind(y,n-y) ~ x,data=d,family=binomial(probit))
confint(m1) # +-2SE approximation
library(MASS)
confint(m1) # profiling
x1 <- with(predict(m1,se.fit=TRUE,type="response"),
          fit+outer(se.fit,c(l=-2,u=2)))
x2 <- with(predict(m1,se.fit=TRUE,type="link"),
          fit+outer(se.fit,c(l=-2,u=2)))
x2 <- pnorm(x2)

ab <- mvrnorm(20,coef(m1),vcov(m1))
matplot(x2,type="l",col="black",lwd=3,lty=1)
matlines(x1,type="l",col="red",lwd=3,lty=1)
matlines(pnorm(t(ab%*%rbind(1,x))))

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907