[R] dixon test

laut timothyslau at gmail.com
Fri Nov 16 22:58:57 CET 2012


I would like to extend Dixon's critical values beyond n = 30. I've read over the
Rorabacher article, but I didn't understand the equations well enough to convert
them to Excel formulas and then "drag" the cells out to extend n.
Rorabacher,_1991.pdf
<http://r.789695.n4.nabble.com/file/n4649819/Rorabacher%2C_1991.pdf>  
Dean_&_Dixon,_1951.pdf
<http://r.789695.n4.nabble.com/file/n4649819/Dean_%26_Dixon%2C_1951.pdf>  
Dixon'sQ.xlsx <http://r.789695.n4.nabble.com/file/n4649819/Dixon%27sQ.xlsx> 
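
One alternative to converting the equations, offered only as a rough sketch: approximate the critical values by simulation. The R code below draws many standard normal samples of size n, computes Dixon's r22 ratio for the largest value, and takes an upper quantile as the critical value. This is a Monte Carlo approximation, not the derivation used by Dixon or Rorabacher. The helper names `dixon_r22_upper` and `sim_dixon_crit`, the one-sided 5% level, and the number of simulations are all assumptions chosen for illustration.

```r
# Hypothetical helper (not part of any package): Dixon's r22 ratio for the
# largest value, r22 = (x[n] - x[n-2]) / (x[n] - x[3]) on the sorted sample.
dixon_r22_upper <- function(x) {
  x <- sort(x)
  n <- length(x)
  stopifnot(n >= 6)
  (x[n] - x[n - 2]) / (x[n] - x[3])
}

# Approximate the one-sided critical value at level alpha by simulating
# nsim standard normal samples of size n and taking the (1 - alpha) quantile.
sim_dixon_crit <- function(n, alpha = 0.05, nsim = 20000, seed = 1) {
  set.seed(seed)
  stats <- replicate(nsim, dixon_r22_upper(rnorm(n)))
  unname(quantile(stats, probs = 1 - alpha))
}

# Simulated critical value for a sample of size 40
crit_40 <- sim_dixon_crit(40)
crit_40
```

Increasing `nsim` tightens the estimate. Running the simulation for n = 30 first and comparing it with the published table (keeping in mind whether that table is one- or two-sided) is a sensible sanity check before trusting values for larger n.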
Fernando Marmolejo-Ramos wrote
> 
> Hi Giov,
> 
> About the Dixon test: I just ran a simple test with a sample of 40 and
> got:
> 
> Error in dixon.test(x) : Sample size must be in range 3-30
> 
> So it seems that most of the tests in the "outliers" package are designed
> for small samples. See also the R News article "Processing data for
> outliers" by Lukasz Komsta (the developer of the package), published in
> May 2006 (vol. 6/2).
> 
> However, that package also has a function called "scores" that works
> for large samples. With it you can get the z scores and p-values for your
> observations and determine which values are considered outliers.
> 
> Try this simple example:
> 
> library(outliers)
> library(gamlss.dist)
> 
> # draw 100 values from an ex-Gaussian (exponential + Gaussian) distribution,
> # which usually has heaps of outliers
> x <- rexGAUS(100, 2000, 3000, 5000)
> 
> # with n = 100 this call stops with the error above, confirming that
> # dixon.test() only accepts samples of size 3 to 30; try() lets the
> # rest of the script keep running
> try(dixon.test(x))
> 
> # see what the data set looks like and visually confirm the outliers
> boxplot(x, notch = TRUE)
> 
> # sort the observations in ascending order
> sort(x)
> 
> # return the p-value of each observation (based on its z score), sorted
> sort(scores(x, type = "z", prob = 1))
> 
> # flag which observations exceed the 0.95 probability cutoff
> # (TRUE = considered an outlier)
> sort(scores(x, prob = 0.95))
> 
> The package author's help page says this about the "prob" argument:
> 
> prob ---- If set, the corresponding p-values instead of scores are given.
> If value is set to 1, p-value are returned. Otherwise, a logical vector is
> formed, indicating which values are exceeding specified probability. In
> "z" and "mad" types, there is also possibility to set this value to zero,
> and then scores are confirmed to (n-1)/sqrt(n) value, according to
> Shiffler (1998). The "iqr" type does not support probabilities, but "lim"
> value can be specified. 
> 
> The Shiffler reference given in the help page is not correct. The correct
> one is:
> 
> Shiffler, R. E. (1988). Maximum Z scores and outliers. The American
> Statistician, 42(1), 79-80.
> 
> I hope this helps,
> 
> Fernando
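
To make the role of the (n-1)/sqrt(n) value concrete, here is a small base-R sketch. Shiffler's result is that no z score in a sample of size n can exceed (n-1)/sqrt(n) in absolute value. The helper names below are made up for illustration, and the cutoff rule |z| > qnorm(prob) is an assumption about how a probability such as 0.95 translates into a z-score threshold, not a confirmed description of what scores() does internally.

```r
# Hypothetical helpers for illustration; not part of the outliers package.
z_scores <- function(x) (x - mean(x)) / sd(x)
shiffler_bound <- function(n) (n - 1) / sqrt(n)

# Nine identical values plus one extreme value: the extreme case in which
# the largest |z| reaches the bound exactly.
x <- c(rep(0, 9), 1000)
z <- z_scores(x)
max_abs_z <- max(abs(z))
bound <- shiffler_bound(length(x))

# Assumed cutoff rule: flag observations whose |z| exceeds the normal quantile.
flagged <- abs(z) > qnorm(0.95)

max_abs_z
bound
which(flagged)
```

For n = 10 the bound is about 2.85, so a rule such as |z| > 3 could never flag anything in a sample that small. That is the practical point of the bound.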

--
View this message in context: http://r.789695.n4.nabble.com/dixon-test-tp864308p4649819.html
Sent from the R help mailing list archive at Nabble.com.


