[R] poisson regression with robust error variance ('eyestudy

Paul Johnson pauljohn32 at gmail.com
Thu May 8 22:35:38 CEST 2008


Ted Harding said:
> I can get the estimated RRs from

> RRs <- exp(summary(GLM)$coef[,1])

> but do not see how to implement confidence intervals based
> on "robust error variances" using the output in GLM.


Thanks for the link to the data.  Here's my best guess.   If you use
the following approach, with the HC0 type of robust standard errors in
the "sandwich" package (thanks to Achim Zeileis), you get "almost" the
same numbers as that Stata output gives. The estimated b's from the
glm match exactly, but the robust standard errors are a bit off.



### Paul Johnson 2008-05-08
### sandwichGLM.R
system("wget http://www.ats.ucla.edu/stat/stata/faq/eyestudy.dta")

library(foreign)

dat <- read.dta("eyestudy.dta")

### Ach, stata codes factor contrasts backwards

dat$carrot0 <- ifelse(dat$carrot==0,1,0)
dat$gender1 <- ifelse(dat$gender==1,1,0)

glm1 <- glm(lenses~carrot0,  data=dat, family=poisson(link=log))

summary(glm1)

library(sandwich)

vcovHC(glm1)


sqrt(diag(vcovHC(glm1)))


sqrt(diag(vcovHC(glm1, type="HC0")))

### Result:
# > summary(glm1)
# Call:
# glm(formula = lenses ~ carrot0, family = poisson(link = log),
#    data = dat)

# Deviance Residuals:
#     Min       1Q   Median       3Q      Max
# -1.1429  -0.9075   0.3979   0.3979   0.7734

# Coefficients:
#            Estimate Std. Error z value Pr(>|z|)
# (Intercept)  -0.8873     0.2182  -4.066 4.78e-05 ***
# carrot0       0.4612     0.2808   1.642    0.101
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# (Dispersion parameter for poisson family taken to be 1)

#     Null deviance: 67.297  on 99  degrees of freedom
# Residual deviance: 64.536  on 98  degrees of freedom
# AIC: 174.54

# Number of Fisher Scoring iterations: 5

# > sqrt(diag(vcovHC(glm1, type="HC0")))
# (Intercept)     carrot0
#  0.1673655   0.1971117
# >

### Compare against
## http://www.ats.ucla.edu/stat/stata/faq/relative_risk.htm
## robust standard errors are:
## .1682086  .1981048


glm2 <- glm(lenses~carrot0 +gender1 +latitude, data=dat,
family=poisson(link=log))


sqrt(diag(vcovHC(glm2, type="HC0")))

### Result:


# > summary(glm2)

#Call:
# glm(formula = lenses ~ carrot0 + gender1 + latitude, family =
poisson(link = log),
#     data = dat)

# Deviance Residuals:
#     Min       1Q   Median       3Q      Max
# -1.2332  -0.9316   0.2437   0.5470   0.9466

# Coefficients:
#            Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.65212    0.69820  -0.934   0.3503
# carrot0      0.48322    0.28310   1.707   0.0878 .
# gender1      0.20520    0.27807   0.738   0.4605
# latitude    -0.01001    0.01898  -0.527   0.5980
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# (Dispersion parameter for poisson family taken to be 1)

#    Null deviance: 67.297  on 99  degrees of freedom
# Residual deviance: 63.762  on 96  degrees of freedom
# AIC: 177.76

# Number of Fisher Scoring iterations: 5

# > sqrt(diag(vcovHC(glm2, type="HC0")))
# (Intercept)     carrot0     gender1    latitude
# 0.49044963  0.19537743  0.18481067  0.01275001

### UCLA site has:

## .4929193   .1963616  .1857414  .0128142


So, either there is some "rounding error" or Stata is not using
exactly the same algorithm as vcovHC in sandwich.

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas



More information about the R-help mailing list