[R] OLS Regression diagnostic measures check list - what to consider?

Greg Snow Greg.Snow at imail.org
Wed May 5 20:29:16 CEST 2010


First, a note: while that is a nice list, I think it needs a disclaimer about only running tests that answer a meaningful question for the data/problem being studied.  If all those tests are run on a collection of datasets, I would be most suspicious of the datasets that passed every test.  Also, failing some of those tests does not necessarily mean that there is a problem with the regression model or its inferences.

This leads to what I think needs to be included on such lists (or to replace such lists): the methods described in the paper:

Buja, A., Cook, D., Hofmann, H., Lawrence, M., Lee, E.-K., Swayne,
     D.F. and Wickham, H. (2009) Statistical inference for exploratory
     data analysis and model diagnostics. Phil. Trans. R. Soc. A
     367, 4361-4383. doi: 10.1098/rsta.2009.0120

In short, the paper says to create several plots: one is the residual (or other diagnostic) plot from the real data, and the rest are based on simulated data that fulfill all the assumptions.  If you cannot tell which plot is "real", then any violations of the assumptions are not practically significant.

The vis.test function in the TeachingDemos package implements a version of this test.
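
To make the idea concrete, here is a minimal base-R sketch of such a lineup.  It is not the code behind vis.test, just an illustration under assumptions of my own: the data frame `dat`, the formula `y ~ x`, the seed, and the choice of 8 simulated plots are all made up for the example.

```r
# Lineup sketch: hide the real residual plot among plots of simulated residuals.
set.seed(42)                          # arbitrary seed, for reproducibility only

# Hypothetical example data; substitute your own data frame and formula.
dat   <- data.frame(x = runif(100, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(100)

fit   <- lm(y ~ x, data = dat)
n_sim <- 8                            # number of decoy plots (an assumed choice)
sims  <- simulate(fit, nsim = n_sim)  # responses drawn from the fitted model,
                                      # so they satisfy the model assumptions

# Refit the model to each simulated response and keep fitted values and residuals.
panels <- lapply(sims, function(y_sim) {
  d   <- dat
  d$y <- y_sim
  f   <- lm(y ~ x, data = d)
  data.frame(fitted = fitted(f), resid = resid(f))
})

# Put the real data first, then shuffle so its position is unknown to the viewer.
panels   <- c(list(data.frame(fitted = fitted(fit), resid = resid(fit))), panels)
ord      <- sample(length(panels))
real_pos <- which(ord == 1)           # panel number that holds the real data

op <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
for (i in seq_along(ord)) {
  p <- panels[[ord[i]]]
  plot(p$fitted, p$resid, main = i, xlab = "", ylab = "")
  abline(h = 0, lty = 2)
}
par(op)
```

Look at the nine panels, pick the one that seems different from the rest, and only then print `real_pos`.  If you (or better, a colleague who has not seen the data) cannot pick out the real panel more often than chance would allow, the residuals look like what the model assumes.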

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Tal Galili
> Sent: Wednesday, May 05, 2010 5:20 AM
> To: r-help at r-project.org
> Subject: [R] OLS Regression diagnostic measures check list - what to
> consider?
> 
> Hello dear R help list,
> 
> I wish to compile a check-list for diagnostic measures for OLS
> regression.
> 
> My question:
> Can you offer more (or newer) tests/measures for the validity of a linear
> model than what is given here:
> http://www.statmethods.net/stats/rdiagnostics.html
> 
> This resource gives a list of measures to test for:
> OUTLIERS, INFLUENTIAL OBSERVATIONS, NON-NORMALITY, NON-CONSTANT ERROR
> VARIANCE, MULTI-COLLINEARITY, NONLINEARITY, NON-INDEPENDENCE OF ERRORS
> and some global validation.
> 
> I came across it after searching online for ways to validate a regression
> model.
> Although this is a great list, I am wondering if there are any newer methods
> that are overlooked, or important considerations to take into account that
> are not described on that page.
> 
> 
> Thanks,
> Tal
> 
> 
> 
> ----------------Contact
> Details:-------------------------------------------------------
> Contact me: Tal.Galili at gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew)
> |
> www.r-statistics.com (English)
> -----------------------------------------------------------------------
> -----------------------
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


