[R] kmeans cluster stability

Marc Feldesman feldesmanm at pdx.edu
Tue Mar 13 22:36:07 CET 2001


I'm doing kmeans partitioning on a small (n=26) dataset that has 5 
variables.  I noticed that if I repeatedly run the same command, the 
cluster centers change and the cluster membership changes.

Using RW1022 under Windows NT & Windows 2000

 >kmeans(pottery[,1:5], 4, 20)

[...snip]
$size
[1] 7 3 9 7
[...snip]
$size
[1]  7 10  4  5
[...snip]
$size
[1]  6 10  5  5

yields a different answer every time I run it.  Sometimes the answer is 
different only in the order of withinss (and the ordering of the numbers of 
cases assigned to each group).  Other times there are completely different 
centers, withinss and completely different cluster configurations.  This 
variability doesn't happen in either S-Plus 2000 or S-Plus 6.0 (Beta 2).

I can see from the help that the R kmeans() function chooses a random set 
of rows as cluster centers if the initial centers aren't specified, while 
S-Plus uses hclust() and cutree() to determine the initial clusters.
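
For illustration, the hclust()/cutree() style of start described above can 
be approximated in R by computing the initial centers explicitly and passing 
them to kmeans().  This is only a sketch under stated assumptions: `x` is a 
simulated stand-in for pottery[,1:5] (the real data are not reproduced here), 
hclust() is used with its default linkage on Euclidean distances, and nothing 
here claims to match what S-Plus does internally.  In R releases of this 
vintage kmeans() and hclust() live in the mva package, so library(mva) would 
be needed first; in current R they are part of stats.

```r
# Hypothetical stand-in for pottery[, 1:5]: 26 rows, 5 numeric columns,
# simulated with four shifted groups so the example has some structure.
set.seed(1)
x <- matrix(rnorm(26 * 5), nrow = 26, ncol = 5) +
  rep(c(0, 3, 6, 9), length.out = 26)   # one offset per row, recycled by column

k <- 4

# Hierarchical clustering on Euclidean distances, cut into k groups.
groups <- cutree(hclust(dist(x)), k = k)

# Group means of every column: a k x 5 matrix of starting centers.
init <- apply(x, 2, function(col) tapply(col, groups, mean))

# With explicit centers there is no random choice of starting rows,
# so repeated calls give the same partition.
fit1 <- kmeans(x, centers = init, iter.max = 20)
fit2 <- kmeans(x, centers = init, iter.max = 20)
identical(fit1$cluster, fit2$cluster)
```

Because the starting centers are fixed, the cluster numbering is also fixed, 
so $size and $withinss come back in the same order on every run.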

Is there any way to "make" kmeans results persist under repeated uses of 
the same command?
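
One possible way to get persistent results while keeping the random start is 
to fix the random number seed immediately before each call.  The sketch below 
assumes the same kind of simulated stand-in for pottery[,1:5] (hypothetical 
data, not the real set); the seed value 42 is arbitrary.  The `nstart` 
argument used in the last call is an assumption about the R version: it 
exists in kmeans() in current R, but may not be available in older releases.

```r
# Hypothetical stand-in for pottery[, 1:5]: 26 rows, 5 numeric columns.
set.seed(1)
x <- matrix(rnorm(26 * 5), nrow = 26, ncol = 5) +
  rep(c(0, 3, 6, 9), length.out = 26)

# Same seed before each call => same random starting rows => same answer.
set.seed(42)
run1 <- kmeans(x, 4, 20)   # 4 clusters, at most 20 iterations
set.seed(42)
run2 <- kmeans(x, 4, 20)
identical(run1$cluster, run2$cluster)

# In current R, several random starts can be tried and the best kept,
# which makes the result less sensitive to any single unlucky start.
set.seed(42)
best <- kmeans(x, centers = 4, iter.max = 20, nstart = 25)
best$size
```

Fixing the seed makes a run repeatable, but it does not make the solution 
any better; a different seed can still lead to a different partition.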

Thanks,




=====================
Dr. Marc R. Feldesman
Professor and Chairman
Anthropology Department
Portland State University
1721 SW Broadway
Portland, Oregon 97201
email:  feldesmanm at pdx.edu
phone:  503-725-3081
fax:    503-725-3905
http://web.pdx.edu/~h1mf
PGP Key Available On Request
======================

