[R] Robust standard errors in logistic regression

Robert Duval rduval at gmail.com
Thu Jul 6 03:11:46 CEST 2006


This discussion leads to another point which is more subtle, but more
important...

You can always get Huber-White (a.k.a. robust) estimators of the
standard errors, even in non-linear models like the logistic
regression. However, if you believe your errors do not satisfy the
standard assumptions of the model, then you should not be running that
model, as this might lead to biased parameter estimates.
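
To make "you can always get them" concrete, here is a minimal base-R
sketch of the basic (HC0) Huber-White sandwich for a logistic fit. The
simulated data and all variable names are made up for illustration, and
the model is correctly specified by construction, so the robust and
model-based standard errors should come out close to each other. I
would expect the sandwich package to give essentially the same matrix,
but I have not checked that here.

```r
# Illustrative simulated data: the logit model is correct by construction.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * x))

fit <- glm(y ~ x, family = binomial)

X  <- model.matrix(fit)   # n x p design matrix
mu <- fitted(fit)         # fitted probabilities
u  <- X * (y - mu)        # per-observation score contributions (row i scaled by residual i)

# "Bread": inverse of X' W X with W = diag(mu * (1 - mu))
bread <- solve(crossprod(X * sqrt(mu * (1 - mu))))
# "Meat": sum over observations of score outer products
meat  <- crossprod(u)

V_robust <- bread %*% meat %*% bread   # HC0 sandwich covariance

se_model  <- sqrt(diag(vcov(fit)))     # usual model-based standard errors
se_robust <- sqrt(diag(V_robust))      # Huber-White standard errors
print(cbind(estimate = coef(fit), se_model, se_robust))
```

Nothing in this computation checks whether the coefficients themselves
are estimating the right thing, which is the point of what follows.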

For instance, in the linear regression model you have consistent
parameter estimates regardless of whether the errors are
heteroskedastic or not. However, in the case of non-linear models it
is usually the case that heteroskedasticity will lead to biased
parameter estimates (unless you fix it explicitly somehow).
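
A small simulation illustrates the contrast. This is only a sketch
under assumptions of my own: the error scale exp(0.5 * |x|), the sample
size, and the true slope of 1 are arbitrary choices, and the binary
outcome is generated from a latent-variable model whose logistic error
has that non-constant scale.

```r
set.seed(42)
n <- 50000
x <- rnorm(n)
s <- exp(0.5 * abs(x))   # assumed (illustrative) error scale, depends on x

# Linear model with heteroskedastic errors: OLS slope is still consistent.
y_lin <- 1 * x + s * rnorm(n)
b_ols <- coef(lm(y_lin ~ x))[["x"]]

# Latent-variable logit with the same heteroskedastic scale:
# y = 1 if x + s * e > 0, with e standard logistic.
y_star <- 1 * x + s * rlogis(n)
y_bin  <- as.integer(y_star > 0)
b_logit <- coef(glm(y_bin ~ x, family = binomial))[["x"]]

# Compare both slope estimates with the true latent slope of 1.
print(c(true = 1, ols = b_ols, logit = b_logit))
```

In this setup the OLS slope should land near 1, while the ordinary
logit slope settles well away from it, and a sandwich standard error
computed for that logit fit would describe the precision of the wrong
quantity.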

Stata is famous for providing Huber-White std. errors in most of their
regression estimates, whether linear or non-linear. But this is
nonsensical in the non-linear models since in these cases you would be
consistently estimating the standard errors of inconsistent
parameters.

This point and potential solutions to this problem are nicely discussed
in Wooldridge's Econometric Analysis of Cross Section and Panel Data.






On 7/5/06, Thomas Lumley <tlumley at u.washington.edu> wrote:
> On Wed, 5 Jul 2006, Martin Maechler wrote:
> >>>>>> "Celso" == Celso Barros <celso.barros at gmail.com>
> >>>>>>     on Wed, 5 Jul 2006 04:50:29 -0300 writes:
> >
> > [...............]
> >    Celso> By the way, I was wondering if there is a way to use rlm (from MASS)
> >    Celso> to estimate robust standard errors for logistic regression?
> >
> > rlm stands for 'robust lm'.  What you need here is  'robust glm'.
> >
> > I've already replied to a similar message by you,
> > mentioning the (relatively) new package "robustbase".
> > After installing it, you can
> > use
> >       robustbase::glmrob()
>
> We have a clash of terminology here.  The "robust standard errors" that
> "sandwich" and "robcov" give are almost completely unrelated to glmrob().
> My guess is that Celso wants glmrob(), but I don't know for sure.
>
> The Huber/White sandwich variance estimator for parameters in an ordinary
> generalized linear model gives an estimate of the variance that is
> consistent if the systematic part of the model is correctly specified and
> conservative otherwise.  It is a computationally cheap linear
> approximation to the bootstrap.  These variance estimators seem to usually
> be called "model-robust", though I prefer Nils Hjort's suggestion of
> "model-agnostic", which avoids confusion with "robust statistics". This is
> what sandwich and robcov() do.
>
> glmrob() and rlm() give robust estimation of regression parameters. That
> is, if the data come from a model that is close to the exponential family
> model underlying glm, the estimates will be close to the parameters from
> that exponential family model.  This is a more common statistical sense of
> the term "robust".
>
>
> I think the confusion has been increased by the fact that earlier S
> implementations of robust regression didn't provide standard errors,
> whereas rlm() and glmrob() do. This was partly a quality-of-implementation
> issue and partly because of theoretical difficulties with, e.g., lms().
>
>
>         -thomas
>
> Thomas Lumley                   Assoc. Professor, Biostatistics
> tlumley at u.washington.edu        University of Washington, Seattle
>
