[R] KS Test question (2)

Thu Aug 5 00:29:47 CEST 2010

On Aug 4, 2010, at 5:49 PM, Ralf B wrote:

> Hi R Users,
>
> I have two vectors, x and y, of equal length representing two types of
> data from two studies. I would like to test if they are similar enough
> to use them interchangeably. No assumptions about distributions can be
> made (initial tests clearly show that they are not normal).
> Here is a result:
>
> Two-sample Kolmogorov-Smirnov test
>
> data:  x and y
> D = 0.1091, p-value < 2.2e-16
> alternative hypothesis: two-sided
>
> Warning message:
> In ks.test(x[1:nx], y[1:nx], exact = FALSE) :
>  cannot compute correct p-values with ties
>
> Here are some questions:
>
> a) What does the error message mean and what does it imply?

a) It is a warning, not an error message.
b) It does seem rather self-explanatory.
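
For illustration: the p-value calculation in ks.test assumes continuous
distributions, under which ties have probability zero, so tied values make
the reported p-value approximate. Below is a minimal sketch on simulated
data (not the poster's), rounded to force ties. The helpers ks_stat and
perm_ks are hypothetical names, not part of R, and the permutation p-value
is only one possible way to sidestep the continuity assumption.

```r
# Simulated data, rounded so that ties are guaranteed (an assumption for
# illustration; the original x and y were not posted).
set.seed(1)
x <- round(rnorm(100), 1)
y <- round(rnorm(100, mean = 1), 1)

# TRUE if any value occurs more than once in the pooled sample.
has_ties <- anyDuplicated(c(x, y)) > 0

# Hypothetical helper: largest vertical gap between the two empirical CDFs,
# evaluated at every observed value.
ks_stat <- function(x, y) {
  grid <- sort(unique(c(x, y)))
  max(abs(ecdf(x)(grid) - ecdf(y)(grid)))
}

# Hypothetical helper: permutation p-value for the KS statistic.
# Group labels are reshuffled, so no continuity assumption is needed.
perm_ks <- function(x, y, n_perm = 500) {
  observed <- ks_stat(x, y)
  pooled <- c(x, y)
  nx <- length(x)
  perm <- replicate(n_perm, {
    idx <- sample(length(pooled), nx)
    ks_stat(pooled[idx], pooled[-idx])
  })
  # Small tolerance guards against floating-point noise in the comparison.
  (1 + sum(perm >= observed - 1e-12)) / (n_perm + 1)
}

p_perm <- perm_ks(x, y)
```

The permutation p-value cannot go below 1 / (n_perm + 1), so n_perm sets
its resolution.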

> b) The data is very noisy and the initial result

What "initial result"?

> shows that there is
> no relation between x and y. Is there a way to calculate an effect
> size?

An "effect size" implies some sort of statistical model. You have not  
offered one yet.
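
As one example of what supplying a model might look like: if you were
willing to assume that y is simply x shifted by a constant (a location-shift
model, which is an assumption and was not stated in the question), then the
Hodges-Lehmann estimate of that shift is one candidate effect size. A
minimal sketch on simulated, non-normal data:

```r
# Simulated skewed data; y is x's distribution shifted by 0.5.
# These values are assumptions for illustration only.
set.seed(7)
x <- rexp(300)
y <- rexp(300) + 0.5

# With conf.int = TRUE, wilcox.test reports the Hodges-Lehmann estimate of
# the location difference (first argument minus second) and a confidence
# interval for it.
fit <- wilcox.test(y, x, conf.int = TRUE)

shift_estimate <- unname(fit$estimate)  # estimated shift of y relative to x
shift_ci <- fit$conf.int                # 95% interval by default
```

If the two distributions differ in shape rather than location, a single
shift number would not describe the difference well, which is why the model
has to come first.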

> c) Can the p-value be used, when running tests over a large number of
> different data sets, as a metric for ranking similarity between x and
> y data sets?

Not in a useful way. With large datasets the p-value from ks.test will
almost always be small, because even trivial differences become
statistically significant, but that information does not characterize
the differences in distribution in any meaningful way. Many similar
questions have been posted and answered over the years on r-help.
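
A small simulation makes the point. This is a sketch on simulated normal
data with an arbitrarily chosen mean shift of 0.05; the sample sizes are
likewise assumptions for illustration. The underlying difference is the same
at every sample size, yet the p-value changes by many orders of magnitude
with n.

```r
set.seed(42)
shift <- 0.05                 # same tiny true difference at every n
sizes <- c(100, 1000, 100000)

# One row per sample size: n, the KS statistic D, and the p-value.
results <- t(sapply(sizes, function(n) {
  x <- rnorm(n)
  y <- rnorm(n, mean = shift)
  res <- ks.test(x, y)
  c(n = n, D = unname(res$statistic), p = res$p.value)
}))

print(results)
```

At the largest n, D settles near the small true gap between the two
distributions while the p-value becomes tiny, so the p-value mostly
reflects sample size, not how different the distributions are.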
>
> Best
> R.

David Winsemius, MD
West Hartford, CT