[R] Robust regression with groups

Berton Gunter gunter.berton at gene.com
Wed Oct 20 17:42:23 CEST 2004


Angelo and Folks:

Beware! It is not at all clear what you mean by "robust" regression. The
sandwich estimator is often said to be "robust" to model misspecification in
the sense that it converges to the correct covariance matrix whether or not
the correlation structure in the GEE has been correctly specified (as
Dimitris implied). Is this what you mean? Mixed effect models are often said
to be "robust" in the sense that individual group "estimators" (BLUPs) are
shrunk toward the overall fixed effect estimates. Is this what you mean?
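
For concreteness, here is a minimal sketch of the first sense: a sandwich
covariance for an ordinary least-squares fit of y on one covariate, treating
subjects as independent groups and allowing arbitrary dependence within a
group. It is written in plain Python for illustration, not taken from any of
the R packages mentioned in this thread. The function name and arguments are
hypothetical, and no small-sample correction is applied (an assumption that
matters when there are few groups).

```python
from collections import defaultdict


def cluster_robust_ols(x, y, groups):
    """Fit y = b0 + b1*x by least squares; return the coefficients, the
    naive covariance matrix, and the cluster-robust (sandwich) one."""
    n = len(y)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # "Bread": inverse of X'X for the design matrix with columns [1, x].
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Naive covariance: assumes independent, equal-variance errors.
    s2 = sum(r * r for r in resid) / (n - 2)
    naive = [[s2 * bread[i][j] for j in range(2)] for i in range(2)]

    # "Meat": sum over groups of the outer product of each group's
    # summed score vector X_g' e_g (no small-sample correction).
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, ri, g in zip(x, resid, groups):
        scores[g][0] += ri
        scores[g][1] += ri * xi
    meat = [[sum(s[i] * s[j] for s in scores.values()) for j in range(2)]
            for i in range(2)]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    # Sandwich: bread * meat * bread.
    robust = matmul(matmul(bread, meat), bread)
    return (b0, b1), naive, robust


# Hypothetical data: three subjects, two years each.
coef, naive, robust = cluster_robust_ols(
    [0, 1, 2, 3, 4, 5],
    [1.2, 2.9, 5.3, 6.8, 9.1, 11.2],
    ["s1", "s1", "s2", "s2", "s3", "s3"],
)
```

Note that this only changes the covariance estimate. The coefficients are the
ordinary least-squares ones, so nothing here makes the fit resistant to
outliers.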

In other applications, "robustness" can mean insensitivity to distributional
assumptions. Mixed effects models for continuous responses commonly assume
normality (as the estimates solve likelihood equations), as do GLMMs for
the random effects. I know of no definitive work that has examined
sensitivity of estimates (or inferences, which are, at best, asymptotic
anyway) to those assumptions. (In the simple independent-errors case,
estimates are usually not at all sensitive.) However, I am a
novice here, so others may be able to illuminate the issue more.

Finally, "robustness" is often used to mean "outlier resistance." Here the
situation is yet murkier. Do you mean resistance to individual "outlying"
observations within a subject or resistance to outlying subjects? Shrinkage
should help with both, but, again, I know of no definitive work, especially
regarding resistance to individual extreme values. Given the sensitivity of
covariance estimates to heavy tails and the consequent inferential
inefficiency, this presumably could be a problem. Finding methods that could
deal with this may be nearly impossible, as you are adding yet another layer
of nonlinear estimation (determining optimal case weights, parameters
for mixture contamination models, or whatever) to the problem. It is easy
to come up with examples where the data are inherently ambiguous and
parameter estimates for resistant case weights and the model would trade off
with each other depending on starting values. That is, too many nonlinear
parameters are being estimated and the model estimates are therefore
unstable.
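
To show what "resistant case weights" means in the simplest possible
setting, here is a rough sketch of a Huber-type location estimate computed by
iteratively reweighted averaging. It is plain Python for illustration only:
it covers a single sample with no grouping and no regression, so it does not
address the harder mixed-model problem described above. The function name is
hypothetical; the tuning constant 1.345 and the MAD-based scale are
conventional choices assumed here.

```python
from statistics import median


def huber_location(y, k=1.345, tol=1e-10, max_iter=100):
    """Return a Huber-type location estimate and the case weights
    used in the final reweighting step."""
    center = median(y)
    # Resistant scale: median absolute deviation, rescaled so that it
    # is consistent for the standard deviation under normality.
    mad = median([abs(v - center) for v in y])
    scale = mad / 0.6745
    if scale == 0:
        # Degenerate case: at least half the data are identical.
        return center, [1.0] * len(y)

    cutoff = k * scale
    mu = center  # starting value
    weights = [1.0] * len(y)
    for _ in range(max_iter):
        # Full weight for small residuals, down-weighting for large ones.
        weights = [1.0 if abs(v - mu) <= cutoff else cutoff / abs(v - mu)
                   for v in y]
        new_mu = sum(w * v for w, v in zip(weights, y)) / sum(weights)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu, weights


# Hypothetical data with one gross outlier in the last position.
values = [9.8, 10.1, 10.0, 9.9, 10.2, 30.0]
estimate, case_weights = huber_location(values)
```

Here the scale is fixed before iterating and the starting value is the
median, which keeps this toy problem well behaved. Once the scale, the
weights, and several variance components all have to be estimated together,
that convenience is gone, which is the instability described above.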

Again, I am happy to leave more definitive resolution and correction of any
errors in my comments to the experts, but, at the least, I think you need to
think more and communicate more clearly about what you mean by "robust." 

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of 
> Dimitris Rizopoulos
> Sent: Wednesday, October 20, 2004 7:08 AM
> To: Angelo Secchi
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Robust regression with groups
> 
> Hi Angelo,
> 
> There are two possible options (at least to my knowledge):
> 
> 1. to use a random-effects model, either using `lme' (packages: nlme, 
> lme4) if you have normal data or `glmmPQL' (package: MASS) or `GLMM' 
> (package: lme4) or `glmmML' (package:glmmML) if you cannot use the 
> normal distribution.
> 
> 2. to use a GEE model with robust (sandwich) standard error estimation. 
> See `gee' (package: gee) and `geese' (package: geepack).
> 
> I hope this helps.
> 
> Best,
> Dimitris
> 
> ----
> Dimitris Rizopoulos
> Ph.D. Student
> Biostatistical Centre
> School of Public Health
> Catholic University of Leuven
> 
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/16/396887
> Fax: +32/16/337015
> Web: http://www.med.kuleuven.ac.be/biostat/
>      http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
> 
> 
> 
> 
> ----- Original Message ----- 
> From: "Angelo Secchi" <secchi at sssup.it>
> To: <r-help at stat.math.ethz.ch>
> Sent: Wednesday, October 20, 2004 3:22 PM
> Subject: [R] Robust regression with groups
> 
> 
> >
> >
> > Hi,
> > I have data on a group of subjects in different years. I should assume
> > that observations regarding different individuals are independent but
> > observations for the same individual in different years are not and I
> > would like to have an estimated standard error (and variance-covariance
> > matrix) taking into account this problem.
> >
> > More generally, is there a way in R to run a (robust) regression
> > having different groups in the observations and specifying that the
> > observations are independent across groups but not necessarily
> > independent within groups?
> >
> > Thanks
> > a.
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
> 



