[R] Robustness of linear mixed models

Berton Gunter gunter.berton at gene.com
Tue Jun 27 17:29:14 CEST 2006


Below...

> > Hello,
> >
> > with 4 different linear mixed models (continuous dependent) I find that my
> > residuals do not follow the normality assumption (significant Shapiro-Wilk
> > with values equal/higher than 0.976; sample sizes 750 or 1200). I find,
> > instead, that my residuals are really well fitted by a t distribution with
> > dofs' ranging, in the different datasets, from 5 to 12.
> >
> > Should this be considered such a severe violation of the normality
> > assumption as to make model-based inferences invalid?
> 
> For some aspects, yes.  Given that R provides you with the means to fit
> robust linear models, why not use them and find out if they make a
> difference to the aspects you are interested in?
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 

Or do your inferences in a way that does not depend on normality, perhaps
via bootstrapping (taking care to honor the multilevel sampling assumptions)?
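
To make "honoring the multilevel sampling" concrete, here is a minimal sketch
of one possible scheme, a cluster (case) bootstrap that resamples whole groups
rather than individual observations. It is written in plain Python rather than
R only so that it is self-contained. The simulated data, the function names
simulate_groups and cluster_bootstrap_ci, and the choice of the mean as the
statistic are all assumptions made for illustration; a real analysis would
refit the mixed model on each resample, and other schemes (parametric or
residual bootstraps) may suit a given design better.

```python
import random
import statistics


def simulate_groups(n_groups=30, n_per_group=25, df=5, seed=1):
    """Hypothetical two-level data: normal group effects, t-distributed errors."""
    rng = random.Random(seed)
    groups = {}
    for g in range(n_groups):
        group_effect = rng.gauss(0.0, 1.0)
        obs = []
        for _ in range(n_per_group):
            # Student t draw built as normal / sqrt(chi-square / df)
            chi2 = rng.gammavariate(df / 2.0, 2.0)
            obs.append(group_effect + rng.gauss(0.0, 1.0) / (chi2 / df) ** 0.5)
        groups[g] = obs
    return groups


def cluster_bootstrap_ci(groups, stat, n_boot=1000, alpha=0.05, seed=2):
    """Percentile interval from resampling whole groups with replacement."""
    rng = random.Random(seed)
    ids = list(groups)
    estimates = []
    for _ in range(n_boot):
        # Draw as many groups as in the data, keeping each group intact
        sampled_ids = [rng.choice(ids) for _ in ids]
        pooled = [y for g in sampled_ids for y in groups[g]]
        estimates.append(stat(pooled))
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


groups = simulate_groups()
lower, upper = cluster_bootstrap_ci(groups, statistics.fmean)
print(f"95% cluster-bootstrap interval for the mean: ({lower:.3f}, {upper:.3f})")
```

Resampling individual observations instead would ignore the within-group
correlation and would typically give intervals that are too narrow.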

Cautions apply. 

First, fitting a linear mixed model is actually a nonlinear estimation
problem, as is robust linear fitting, so the process may be sensitive to
initial values. I believe this was pointed out to me by Professor Ripley,
though in a different context. I would appreciate any more informed comments
and qualifications about this.
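
As a toy illustration of sensitivity to initial values in an iterative robust
fit, the sketch below estimates a location parameter by iteratively reweighted
averaging with Tukey's biweight weights. It is not a mixed model and not any
particular R routine: the function name biweight_location, the fixed scale of
1.0, the tuning constant, and the deliberately bimodal data are all
assumptions chosen to make the effect visible. Convex losses such as Huber's
do not show this behavior for a location problem.

```python
def biweight_location(data, start, c=4.685, scale=1.0, tol=1e-10, max_iter=500):
    """Location estimate via iteratively reweighted means (Tukey biweight).

    The scale is held fixed (an assumption made to keep the sketch short).
    """
    mu = start
    for _ in range(max_iter):
        weights = []
        for y in data:
            u = (y - mu) / (c * scale)
            # Points farther than c * scale from the current estimate get weight 0
            weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
        total = sum(weights)
        if total == 0:
            return mu  # no point within reach of the current estimate
        new_mu = sum(w * y for w, y in zip(weights, data)) / total
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Hypothetical data with two well-separated clumps
data = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0, 10.1, 9.9]
print(biweight_location(data, start=0.0))
print(biweight_location(data, start=10.0))
```

The two starting values lead to two different converged estimates, one near
each clump, because the redescending weights ignore whichever clump is far
from the current estimate.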

Second, both normal-theory inference and bootstrapping are asymptotic
and therefore approximate.  I believe this was the point Prof. Ripley was
making when he said "For **some** aspects..." Comparing results under
various assumptions is always a good way to check sensitivity to those sets
of assumptions, though it may emphasize the fact that the choice of the
"right" analysis can be a complex, application- and data-specific issue.

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA


