[R] detection of outliers

Berton Gunter gunter.berton at gene.com
Thu Sep 23 17:32:17 CEST 2004


Not to oversimplify ...

1. (At least) dozens of books and thousands of papers have been written on
this...

2. The most important question is: What is an outlier? (Many smart folks say
that the concept is illogical/flawed -- there is no mystical boundary that
one crosses to become a statistical pariah; many other smart folks
disagree).

3. Equivalently: What is the model with respect to which values are
outlying? (With apologies to Winston Churchill: "That is an indignity up
with which I will not put.")
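
To make point 3 concrete, here is a minimal R sketch using the nine values
from the original question quoted below. It is not a recommendation, and it
is not something proposed in this thread: the two helper functions (flag_z,
flag_mad) and the cutoff of 3 are assumptions chosen purely for illustration.
The point is only that the same data get a different verdict depending on
which yardstick one assumes.

```r
# Data from the original question
x <- c(10, 11, 12, 15, 20, 22, 25, 30, 500)

# The two summaries mentioned in the question
abs(mean(x) - median(x))        # gap between mean and median
mean(x, trim = 0.1) - mean(x)   # trimmed mean minus mean

# Hypothetical rule 1: more than k sample sds from the mean
flag_z <- function(v, k = 3) abs(v - mean(v)) / sd(v) > k

# Hypothetical rule 2: more than k scaled MADs from the median
# (mad() uses the median as center and the 1.4826 constant by default)
flag_mad <- function(v, k = 3) abs(v - median(v)) / mad(v) > k

x[flag_z(x)]     # flags nothing for these data
x[flag_mad(x)]   # flags 500
```

With the usual sample sd, no value in a sample of size n can lie more than
(n - 1)/sqrt(n) sds from the mean, which is about 2.67 for n = 9, so the
first rule cannot flag anything here however extreme the last value is. The
second rule flags 500. Which answer is "right" depends on the model.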

So good advice here is: Beware of good advice about this. (Of course, I may
just be an outlier ...)

;)

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of 
> Phguardiol at aol.com
> Sent: Thursday, September 23, 2004 7:22 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] detection of outliers
> 
> Hi,
> this is both a statistical and an R question...
> what would be the best way / test to detect an outlier value 
> among a series of 10 to 30 values? For instance, if we have 
> the following dataset: 10,11,12,15,20,22,25,30,500 I'd like 
> to have a way to identify the last value as an outlier (only 
> one direction). One way would be to calculate abs(mean - 
> median) and, if elevated (to what extent?), delete the extreme 
> value and then redo.. but is it valid to do so with so few data? 
> Is the (trimmed mean - mean) more efficient? If so, what 
> would be the maximal tolerable value to use as a threshold? 
> (I guess it will be experiment dependent...) Tests for 
> skewness will probably require a larger dataset? 
> any suggestions are very welcome !
> thanks for your help
> Philippe Guardiola, MD
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 

