[R] kmeans: how to retrieve clusters

Peter Langfelder peter.langfelder at gmail.com
Tue Feb 28 07:46:34 CET 2012


On Mon, Feb 27, 2012 at 3:18 PM, ikuzar <razuki at hotmail.fr> wrote:
> Hello,
>
> I'd like to classify data with the kmeans algorithm. In my case, I should
> get 2 clusters as output. Here is my data:
>
>   colCandInd  colCandMed
> 1         82      2950.5
> 2         83      1831.5
> 3       1192      2899.0
> 4       1193      2103.5
>
> The first cluster should be the first two rows;
> the 2nd cluster should be the last two rows.
>
> Here is the code:
> x = colCandList$colCandInd
> y = colCandList$colCandMed
> m = matrix(c(x, y), nrow = length(colCandList$colCandInd), ncol=2)
> kres = kmeans(m, 2)
>
> Is there a way to retrieve both clusters from the output of the algorithm
> in order to process each cluster? (I am looking for something like
> kres$clustList ... where I can process each cluster)
>
> kres$cluster did not yield what I expected ...

I'm not sure what you mean by "process each cluster", or why
kres$cluster is not what you expected. kres$cluster tells you which
cluster each point (row of your matrix) belongs to. The result depends
on how kmeans is initialized (by default, from randomly chosen rows),
since the inter-point distances are quite similar to one another. For
example, I get

> set.seed(10)
> kres = kmeans(m, 2)
> kres$cluster
[1] 2 2 1 1
> set.seed(1)
> kres = kmeans(m, 2)
> kres$cluster
[1] 1 1 2 2
> set.seed(200)
> kres = kmeans(m, 2)
> kres$cluster
[1] 2 2 1 1
> kres = kmeans(m, 2)
> kres$cluster
[1] 1 2 1 2

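So in 3 runs out of 4 I get the partition you expect (rows 1 and 2
together, rows 3 and 4 together; the labels 1 and 2 themselves are
arbitrary and may swap between runs), and once a different one.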
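
One way to make the result less dependent on the random start is the
nstart argument of kmeans, which tries several random starts and keeps
the solution with the smallest total within-cluster sum of squares. The
following is only a sketch: the matrix m is rebuilt here from the four
values in the original post, and nstart = 25 is an arbitrary choice of
mine, not something from the original code.

```r
# The four points from the original post (values copied from the table).
m <- cbind(colCandInd = c(82, 83, 1192, 1193),
           colCandMed = c(2950.5, 1831.5, 2899.0, 2103.5))

# Try 25 random starts and keep the best solution found.
# nstart = 25 is an assumption for illustration; any value > 1 helps.
kres <- kmeans(m, centers = 2, nstart = 25)

# Cluster labels are arbitrary, so compare the grouping, not the labels:
# are rows 1-2 together, rows 3-4 together, and the two groups distinct?
expected_partition <- kres$cluster[1] == kres$cluster[2] &&
                      kres$cluster[3] == kres$cluster[4] &&
                      kres$cluster[1] != kres$cluster[3]

# Total within-cluster sum of squares of the solution that was kept.
kres$tot.withinss
```

Multiple starts make a poor local solution unlikely but do not rule it
out in general, so it is still worth checking the grouping you get.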

If you need the result in a different format, that should be no problem.
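
For example, if a list with one element per cluster is what you are
after, split() on kres$cluster gives that. This is a minimal sketch
under my own assumptions: m is rebuilt from the four values in the
original post, and the names idx_by_cluster, rows_by_cluster and
med_by_cluster are made up for illustration (kmeans itself returns no
clustList component).

```r
# The four points from the original post (values copied from the table).
m <- cbind(colCandInd = c(82, 83, 1192, 1193),
           colCandMed = c(2950.5, 1831.5, 2899.0, 2103.5))

set.seed(1)                       # kmeans starts from random centers
kres <- kmeans(m, centers = 2)

# Row indices of m, grouped by cluster label: a list with one element
# per cluster.
idx_by_cluster <- split(seq_len(nrow(m)), kres$cluster)

# The rows themselves, one sub-matrix per cluster. drop = FALSE keeps a
# matrix even if a cluster contains a single point.
rows_by_cluster <- lapply(idx_by_cluster, function(i) m[i, , drop = FALSE])

# "Processing each cluster": here, the mean of colCandMed per cluster.
med_by_cluster <- sapply(rows_by_cluster,
                         function(cl) mean(cl[, "colCandMed"]))
```

Each element of rows_by_cluster can then be handled on its own, e.g.
rows_by_cluster[["1"]] for the points labelled 1 in that run.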


