[R] Using statistical test to distinguish two groups

Ralf B ralf.bierig at gmail.com
Wed May 5 19:21:17 CEST 2010


Hi R friends,

I am posting this question even though I know that it is closer to
general stats than to R. Please let me know if you are aware of a list
for general statistical questions.

I am looking for a simple method to distinguish two groups of data in
a long vector of numbers:

list <- c(1,2,3,2,3,2,3,4,3,2,3,4,3,2,400,340,3,2,4,5,6,4,3,6,4,5,3)

I would like to 'learn' that 400 and 340 are different from the other
numbers by using a simple approach. The outcome of processing 'list'
should therefore be:

listA <- c(1,2,3,2,3,2,3,4,3,2,3,4,3,2,3,2,4,5,6,4,3,6,4,5,3)
listB <- c(400,340)

I am thinking of a non-parametric test since I have no knowledge of the
underlying distribution. The numbers are time differences between two
actions recorded from the same person over time. Because the data
was obtained from the same person I would naturally tend to use the
Wilcoxon signed-rank test. Any thoughts on that?

Are there any R packages that would process such a vector and use
non-parametric methods to split or divide groups based on their
values? Could clustering be the answer, given that I already know that
I always have two groups with a significant difference between the
two?
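
To make the clustering idea concrete, here is a minimal sketch using only
base R (dist, hclust, cutree). It assumes that there are exactly two
groups and that the smaller cluster is the unusual one; the name 'x' is
used instead of 'list' only to avoid masking the base function. This is
one possible approach, not necessarily the right one for this data:

```r
x <- c(1,2,3,2,3,2,3,4,3,2,3,4,3,2,400,340,3,2,4,5,6,4,3,6,4,5,3)

# Hierarchical clustering on the pairwise distances, cut into two groups
groups <- cutree(hclust(dist(x)), k = 2)

# Assumption: the smaller of the two clusters holds the unusual values
small <- as.integer(names(which.min(table(groups))))

listA <- x[groups != small]   # the bulk of the data
listB <- x[groups == small]   # the values that stand apart
```

A rank-free alternative under the same assumptions would be to flag
values far from the median, for example abs(x - median(x)) > 3 * mad(x),
where the factor 3 is an arbitrary cutoff.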

Thanks a lot,
Ralf


