[R] detection of outliers

Gabor Grothendieck ggrothendieck at myway.com
Thu Sep 23 16:52:05 CEST 2004


 <Phguardiol <at> aol.com> writes:

: 
: Hi,
: this is both a statistical and an R question...
: What would be the best way / test to detect an outlier value among a series
: of 10 to 30 values? For instance, if we have the following dataset:
: 10,11,12,15,20,22,25,30,500, I'd like to have a way to identify the last
: value as an outlier (only one direction). One way would be to calculate
: abs(mean - median) and, if elevated (to what extent?), delete the extreme
: value, then redo... but is it valid to do so with so few data? Is the
: (trimmed mean - mean) more efficient? If so, what would be the maximal
: tolerable value to use as a threshold? (I guess it will be experiment
: dependent...) Tests for skewness will probably require a larger dataset?
: Any suggestions are very welcome!
: Thanks for your help
: Philippe Guardiola, MD


If z is your vector the following all detect outliers:

	boxplot(z)  # will show the outlier

	plot(lm(z ~ 1))  # the various plots show this as well

	library(car)
	outlierTest(lm(z ~ 1))  # tests most extreme value; called outlier.test() in older versions of car
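
If you want the flagged values as numbers rather than as a picture, here is a minimal sketch using boxplot.stats(), which computes what boxplot() draws. It assumes z holds the data from the question and uses the default whisker coefficient of 1.5; that coefficient is a convention rather than a formal test, and the names s, out, upper_fence and high_out are only illustrative.

```r
# Data from the original question
z <- c(10, 11, 12, 15, 20, 22, 25, 30, 500)

# boxplot.stats() returns the five-number summary in $stats and, in $out,
# the points lying more than coef * (hinge spread) beyond either hinge
s <- boxplot.stats(z, coef = 1.5)
out <- s$out

# One-directional check: only flag values above the upper fence
hinges <- s$stats[c(2, 4)]                     # lower and upper hinge
upper_fence <- hinges[2] + 1.5 * diff(hinges)  # upper hinge + 1.5 * hinge spread
high_out <- z[z > upper_fence]

print(out)
print(high_out)
```

For this dataset both out and high_out should contain only the value 500. With so few observations the hinges themselves are not very stable, so treat the result as a screening aid, not as proof that a value is invalid.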



