[R] Kolmogorov-Smirnov test

(Ted Harding) Ted.Harding at manchester.ac.uk
Tue Nov 6 17:23:37 CET 2007


On 06-Nov-07 15:53:53, Oarabile Molaodi wrote:
> I am trying to determine whether two samples are identical or not.
> I'm aware that one can use the Kolmogorov-Smirnov test to
> compare empirical distributions, but since my samples have ties
> I'm not sure if I'm getting the right p-values for the comparison.
> Can the Kolmogorov-Smirnov test be adjusted for the case when ties
> exist, and are there any existing functions in R (performing the
> Kolmogorov-Smirnov test) that can be used in the presence of
> ties?
> 
> Thank you in advance for your help.
> 
> Oarabile

Tests like the Kolmogorov-Smirnov, whose theoretical null
distribution assumes continuous random variables (hence
without ties), do not have a definite null distribution
when ties are possible. Whatever null distribution the
test may have when ties are present (e.g. due to data
being recorded to a relatively coarse precision) will
depend on the pattern of ties.

However, it is possible to investigate the effect of
ties on the P-value by randomly breaking ties.

For instance, suppose your data are recorded to a precision
of 0.1, and you have two such samples X and Y, then let

X.rand <- X + 0.0001*runif(length(X))
Y.rand <- Y + 0.0001*runif(length(Y))

and then do a K-S test on X.rand vs Y.rand.

You will get a P-value. Repeat this many times. You will get
a distribution of P-values. You can extract any relevant
property of this distribution of P-values, for instance
its mean, its 95th percentile (so you can be 95% confident
that the tie-broken P-value is less than this value),
and so on.
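
The repetition described above could be sketched roughly as follows.
This is only an illustration, not a definitive implementation: the
helper name ks.tiebreak, its arguments n.rep and jitter, and the
simulated samples X and Y are all assumptions made up for the example.
Only runif(), replicate(), ks.test(), mean() and quantile() are
standard R.

```r
# Hypothetical helper (not part of R): repeat the tie-breaking K-S test.
#   X, Y   : numeric samples recorded to a coarse precision (e.g. 0.1)
#   n.rep  : number of random tie-breakings to perform
#   jitter : width of the uniform perturbation; it should be much
#            smaller than the recording precision of the data
ks.tiebreak <- function(X, Y, n.rep = 1000, jitter = 0.0001) {
  replicate(n.rep, {
    X.rand <- X + jitter * runif(length(X))
    Y.rand <- Y + jitter * runif(length(Y))
    ks.test(X.rand, Y.rand)$p.value
  })
}

# Made-up data rounded to 0.1, so that ties occur (illustration only)
set.seed(1)
X <- round(rnorm(50), 1)
Y <- round(rnorm(60, mean = 0.3), 1)

p.vals <- ks.tiebreak(X, Y)

mean(p.vals)             # average tie-broken P-value
quantile(p.vals, 0.95)   # 95th percentile of the tie-broken P-values
```

A histogram, hist(p.vals), shows at a glance how much the P-value
moves with the way the ties happen to be broken.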

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 06-Nov-07                                       Time: 16:23:34
------------------------------ XFMail ------------------------------


