[R] poisson regression with robust error variance ('eyestudy

Paul Johnson pauljohn32 at gmail.com
Thu May 8 17:28:27 CEST 2008


On Thu, May 8, 2008 at 8:38 AM, Ted Harding
<Ted.Harding at manchester.ac.uk> wrote:
> The below is an old thread:
>
>  On 02-Jun-04 10:52:29, Lutz Ph. Breitling wrote:
>  > Dear all,
>  >
>  > i am trying to redo the 'eyestudy' analysis presented on the site
>  > http://www.ats.ucla.edu/stat/stata/faq/relative_risk.htm
>  > with R (1.9.0), with special interest in the section on "relative
>  > risk estimation by poisson regression with robust error variance".
>  >
>  > so i guess rlm is the function to use. but what is its equivalent
>  > to the glm's argument "family" to indicate 'poisson'? or am i
>  > somehow totally wrong and this is not applicable here?
There have been several questions about getting robust standard errors
in glm lately.  I went and read that UCLA website on the RR eye study
and the Zou article that uses a glm with robust standard errors.

I don't think "rlm" is the right way to go because that gives
different parameter estimates.  You need to estimate with glm and then
get standard errors that are adjusted for heteroskedasticity. Well,
you may wish to use rlm for other reasons, but to replicate that
eyestudy project, you need to take the ordinary usual estimates of the
b's and adjust the standard errors.

The Zou article does not give the mathematical formulae used to
estimate those robust errors, it rather gives a code snippit on using
SAS proc GENMOD to get those estimates.  Presumably, if we had access
to the SAS formula, we could easily get the calculations we need with
R. It is a little irksome to me that people think saying "use SAS proc
GENMOD" is informative.  Rather, we really need to know which TYPE of
robust standard error is being considered, since there are about 10
competing formulations.

I started checking various R packages for calculating sandwich
variance estimates.  There is a package "sandwich" for that purpose,
and in the "car" package there is a function hccm that can do so.
The hccm in car refuses to take glm objects, but the function vcovHC
in "sandwich" will do so. The discussion for sandwich's vcovHC
function refers to linear models, and lm objects are used in the
examples, but the vignette "sandwich" distributed with the package
states "The HAC estimators are already available for generalized
linear models (fitted by glm) and robust regression (fitted by rlm in
package MASS)." .

As a result, one can get a var/covar matrix from the routines in the
sandwich package, but I'm not entirely sure they are match the ones
SAS gives.  The vcovHC help page has a nice explanation of the
differences across sandwich estimators.  It says they are all variants
on

(X'X)^{-1} X' Omega X (X'X)^{-1}

With different stipulations about Omega.  If we knew the stipulation
about OMEGA used in the SAS routine, we could try it.

I tried to get the eyestudy data, but the link on the UCLA website is
no longer valid.

So I generated some phony 0-1 data:

y <- as.numeric(rnorm(1000) > 0)
x <- rnorm(1000)
> glm1 <- glm(y~x, family=binomial)
> hccm(glm1)
Error in hccm.lm(glm1) : requires unweighted lm
> vcovH
vcovHAC  vcovHC
> glm1 <- glm(y~x, family=poisson(link=log))
> vcovHC(glm1)
              (Intercept)             x
(Intercept)  1.017588e-03 -3.722456e-05
x           -3.722456e-05  9.492517e-04


The default type of estimation the method called HC3,  which is
recommended by authors of some Monte Carlo research studies.  vcovHC
calculates White's basic correction if you run:


> vcovHC(glm1, type="HC0")
              (Intercept)             x
(Intercept)  1.013508e-03 -3.691381e-05
x           -3.691381e-05  9.417839e-04


Compare against the non-robust glm var/covar matrix.

> vcov(glm1)
              (Intercept)             x
(Intercept)  0.0020152998 -0.0000778422
x           -0.0000778422  0.0018721903


In conclusion, use glm followed by vcovHC and I believe you will find
estimates like the ones provided by SAS or Stata.  But, without access
to their data, I can't be entirely sure.



-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas



More information about the R-help mailing list