[R] Kmeans cluster analysis

Monica Pisica pisicandru at hotmail.com
Wed Apr 11 15:31:13 CEST 2007


Hi Nataniel,

As far as I know there is a package called clustTool which has a very nice 
interface and can run several different cluster analyses. It also 
produces a plot of each cluster and the mean of each variable for each 
cluster, and I guess this is what you are after! But depending on which 
parameters you use for the cluster analysis, the package is extremely 
slow if you have more than 5000 data points. Maybe you can take the function 
apart to see where and what generates the plot and use that for your 
analysis.
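
If the package turns out to be too slow, a plot of the cluster means per 
variable can also be drawn with base R alone. The lines below are only a 
rough sketch and are not taken from clustTool: the matrix x is simulated 
stand-in data, and the names x and km are made up for the illustration.

```r
# Simulated stand-in for the real data (hypothetical): 200 rows, 22 variables,
# standardised so that all variables share one scale on the plot.
set.seed(1)
x <- scale(matrix(rnorm(200 * 22), nrow = 200, ncol = 22))
colnames(x) <- paste0("V", 1:22)

# k-means with 4 clusters, keeping the best of 20 random starts.
km <- kmeans(x, centers = 4, nstart = 20, iter.max = 50)

# Profile plot: one line per cluster, variables on the x-axis,
# cluster mean of each variable on the y-axis.
matplot(t(km$centers), type = "b", pch = 1:4, lty = 1, col = 1:4,
        xaxt = "n", xlab = "Variable", ylab = "Cluster mean (standardised)")
axis(1, at = 1:ncol(x), labels = colnames(x), las = 2)
abline(h = 0, lty = 3)  # overall mean of every standardised variable
legend("topright", legend = paste("Cluster", 1:4),
       col = 1:4, pch = 1:4, lty = 1)
```

Variables where one line sits far from the dotted zero line are the ones 
that set that cluster apart from the rest.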

I hope this helps,

Monica Palaseanu-Lovejoy


Message: 35
Date: Tue, 10 Apr 2007 19:51:24 +0000 (GMT)
From: nathaniel Grey <nathaniel.grey at yahoo.co.uk>
Subject: [R] Kmeans cluster analysis
To: r-help at stat.math.ethz.ch
Message-ID: <352480.52445.qm at web23402.mail.ird.yahoo.com>
Content-Type: text/plain

Hello,

I have a data set containing 22 variables. After appropriate 
transformations etc. I ran a kmeans cluster analysis for 4 clusters. I ran 
it 20 times to find the result with the lowest within-cluster sum of 
squares.
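
For reference, this step can be sketched roughly as follows, assuming the 
transformed data sit in a numeric matrix. The matrix x here is simulated 
stand-in data, not the real data set. The nstart argument of kmeans() runs 
the algorithm from several random starts and keeps the solution with the 
lowest total within-cluster sum of squares, so no manual loop is needed.

```r
# Simulated stand-in for the transformed data (hypothetical): 200 x 22.
set.seed(1)
x <- matrix(rnorm(200 * 22), nrow = 200, ncol = 22)
colnames(x) <- paste0("V", 1:22)

# 4 clusters, 20 random starts; kmeans() returns the best of the 20 runs.
km <- kmeans(x, centers = 4, nstart = 20, iter.max = 50)

km$tot.withinss  # total within-cluster sum of squares of the kept solution
km$withinss      # within-cluster sum of squares, one value per cluster
km$size          # number of observations in each cluster
```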

My question is how best do I go about finding out what the characteristics 
of each cluster are? Is one cluster dominated by a particular set of 
variables or by a particular variable?

The only way I know is to look at the means for each variable for each 
cluster, but as there are 22 variables this is time consuming.
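
A possible shortcut for the table of means, sketched on simulated stand-in 
data (x, km, z and top3 are names made up for the illustration): the 
centers component of the kmeans() result already holds the mean of every 
variable in every cluster, and ranking the standardised deviations from the 
overall mean picks out the variables that stand out in each cluster.

```r
# Simulated stand-in for the real data (hypothetical): 200 rows, 22 variables.
set.seed(1)
x <- matrix(rnorm(200 * 22), nrow = 200, ncol = 22)
colnames(x) <- paste0("V", 1:22)
km <- kmeans(x, centers = 4, nstart = 20, iter.max = 50)

# One row per cluster, one column per variable.
round(km$centers, 2)

# The same table computed by hand from the cluster assignments.
by_hand <- aggregate(as.data.frame(x), by = list(cluster = km$cluster),
                     FUN = mean)

# Cluster means expressed as deviations from the overall mean, in units of
# each variable's standard deviation.
z <- scale(km$centers, center = colMeans(x), scale = apply(x, 2, sd))

# Names of the three most extreme variables in each cluster
# (one column per cluster).
top3 <- apply(z, 1, function(r) names(sort(abs(r), decreasing = TRUE))[1:3])
top3
```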

Is there a way to graphically represent the clusters in relation to the 
variables? If so, I might need some guidance on the coding as I am new to 
the R environment.

Any advice and direction would be gratefully received.

best wishes,

Nataniel Grey
