[R] Handling outliers?

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Mar 2 12:32:23 CET 2006


It is not clear what sort of analysis you are doing. For example, 
robust/resistant regression is a way of identifying and downweighting 
outliers in a regression analysis.
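
As a minimal sketch of that idea, assuming a simple straight-line model 
and simulated data (the variables and the 0.5 weight cutoff below are 
made up for illustration, not taken from the original question), rlm() 
in the MASS package returns the final robustness weights, and small 
weights point at the observations that were downweighted:

```r
library(MASS)  # rlm() lives in MASS, a recommended package shipped with R

set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30, sd = 0.5)
y[c(5, 20)] <- y[c(5, 20)] + 15    # two gross errors planted for illustration

fit <- rlm(y ~ x, method = "MM")   # MM-estimator, resistant to gross outliers

# Final robustness weights: values near 0 mark downweighted observations
round(fit$w, 2)

suspects <- which(fit$w < 0.5)     # 0.5 is an arbitrary cutoff for this sketch
suspects
coef(fit)
```

The data are left untouched here: the fit decides how much each 
observation counts, and the weights can be inspected afterwards.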

Also, multivariate outliers are a very different concept from univariate 
ones, and the difference may or may not matter depending on the analysis.
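
To illustrate the distinction, here is a small sketch with simulated 
data (everything in it is hypothetical): a point can look unremarkable 
in each variable separately and still be far from the bulk once the 
correlation between the variables is taken into account. It uses 
mahalanobis() from the stats package with the classical mean and 
covariance.

```r
set.seed(2)
x1 <- rnorm(200)
x2 <- 0.95 * x1 + rnorm(200, sd = sqrt(1 - 0.95^2))  # strongly correlated pair
X  <- rbind(cbind(x1, x2), c(1.5, -1.5))             # row 201 goes against the correlation

# Univariate view: absolute z-scores of the planted point in each coordinate
z <- abs(scale(X))[201, ]
z

# Multivariate view: squared Mahalanobis distance against a chi-squared cutoff
d2     <- mahalanobis(X, colMeans(X), cov(X))
cutoff <- qchisq(0.999, df = 2)   # the 0.999 level is an arbitrary choice
d2[201] > cutoff
```

The classical mean and covariance are themselves affected by outliers, 
so with heavier contamination a resistant estimate (for example 
cov.rob() in MASS) would probably be a safer basis for the distances.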

On Thu, 2 Mar 2006, Robert Lundqvist wrote:

> I am sitting with this fairly big material (20 variables, max length of
> vectors about 3200 observations and a substantial amount of missing
> values). In some cases there are also outliers. Some are obvious, others
> are not that clear.
>
> So far, I have replaced some of the outliers with NA's. However, I would
> like to have a good working procedure where outliers were not excluded
> permanently but rather temporarily. Some way of "marking" observations
> and still keep them seems both preferable and possible.

Depends what `keep' means, but in one sense that is what 
na.action = na.exclude does.
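
A minimal sketch of that workflow, assuming a data frame with made-up 
columns x and y and a deliberately crude flagging rule (the names dat, 
flag and y_work are hypothetical): the original values stay in the data 
frame, a working copy has the flagged values set to NA, and 
na.exclude pads residuals and fitted values with NA so that they still 
line up with the original rows.

```r
set.seed(3)
dat   <- data.frame(x = 1:20)
dat$y <- 1 + 2 * dat$x + rnorm(20)
dat$y[7] <- 200                      # planted outlier (hypothetical data)

# Mark suspect observations instead of overwriting them
dat$flag   <- abs(dat$y - median(dat$y)) > 5 * mad(dat$y)  # crude rule, illustration only
dat$y_work <- ifelse(dat$flag, NA, dat$y)                  # original y is kept untouched

fit <- lm(y_work ~ x, data = dat, na.action = na.exclude)

# residuals() and fitted() are padded with NA for the excluded rows
dat$res <- residuals(fit)
length(residuals(fit)) == nrow(dat)
dat[dat$flag, ]                      # flagged rows, with their original y values
```

Changing the flag and refitting is then enough to bring an observation 
back in; nothing has been deleted from dat.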

> Any suggestions for a good working practice for cases like this? How do
> *you* work? Is there any "standard" package to use?
>
> Robert

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



