[R] offlist comment Re: KS Test question (2)

Ralf B ralf.bierig at gmail.com
Thu Aug 5 10:10:34 CEST 2010


This is unbelievable. Now people like yourself start doing background
searches on one and accusing one of not being professional, plus
posting cheeky R code. The reason I submitted my questions was that
the existing answers did not solve my particular problem (or perhaps I
mistakenly thought so). The point here is that the forum should be a
place where one is allowed to ask questions without first studying the
history of the entire forum for fear that someone might have asked
them before. I was hoping to find clearer answers than what I was able
to read. I do know how to search in Google. But I am not an expert in
statistics, as you already found in your background check. If I were
fluent in statistics and R, and if past answers had exactly solved my
problem, I would not have posted here and I certainly would not have
occupied your expensive attention.





On Wed, Aug 4, 2010 at 6:16 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Aug 4, 2010, at 5:49 PM, Ralf B wrote:
>
>> Hi R Users,
>>
>> I have two vectors, x and y, of equal length representing two types of
>> data from two studies. I would like to test if they are similar enough
>> to use them interchangeably. No assumptions about distributions can be
>> made (initial tests clearly show that they are not normal).
>> Here is a result:
>>
>> Two-sample Kolmogorov-Smirnov test
>>
>> data:  x and y
>> D = 0.1091, p-value < 2.2e-16
>> alternative hypothesis: two-sided
>>
>> Warning message:
>> In ks.test(x[1:nx], y[1:nx], exact = FALSE) :
>>  cannot compute correct p-values with ties
>>
>> Here are some questions:
>>
>> a) What does the warning message mean and what does it imply?
>> b) The data is very noisy and the initial result shows that there is
>> no relation between x and y. Is there a way to calculate an effect
>> size?
>> c) Can the p-value be used, when running tests over a large number of
>> different data sets, as a metric for ranking similarity between x and
>> y data sets?
>
> There has been quite a bit of discussion on this list over the years about
> why the KS test is not good in this situation. If I read the results of a
> search on your name correctly, you are in a department of Information
> Sciences. I would have thought that the first reaction of someone in that
> field would be to do a search on a question. Why are you filling up the
> archives with questions that have been repeatedly asked and answered?
>
> Do you need help in this area?
>
> rhelpSearch <- function(string,
>                  restrict = c("Rhelp10", "Rhelp08", "Rhelp02", "functions"),
>                  matchesPerPage = 100, ...)
>         RSiteSearch(string = string, restrict = restrict,
>                     matchesPerPage = matchesPerPage, ...)
>
>
> rhelpSearch("KS.test ties p-value")
>
>>
>> Best
>> R.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
>


