[R] k-means clustering

Ranjan Maitra maitra at iastate.edu
Mon Mar 19 23:15:56 CET 2007


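kmeans(y[, c("AGE", "PRODUCTS")], 3) should do what I think you want. Column names have to be combined with c(); the braces in {"AGE" ; "PRODUCTS"} evaluate to their last expression only, which is why your third call clustered on "PRODUCTS" alone.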
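
For illustration, here is a minimal sketch of that call on the data from the original post. The data frame is retyped from the table quoted below; the seed and the helper names (num_cols, x, last_only) are my own assumptions, not part of the question.

```r
# Rebuild the example data from the original post.
y <- data.frame(
  AGE      = c(92, 43, 67, 22, 56, 89, 47),
  PRODUCTS = c(3253, 4144, 3246, 4144, 4087, 3836, 4379),
  SEX      = c("M", "F", "M", "F", "F", "M", "M")
)

# c() builds a character vector of column names, so both columns are kept.
num_cols <- c("AGE", "PRODUCTS")
x <- y[, num_cols]

# Braces return only their last expression, so this is just "PRODUCTS".
last_only <- {"AGE"; "PRODUCTS"}

set.seed(1)  # arbitrary seed, only to make the run reproducible
cluster <- kmeans(x, centers = 3)

cluster$cluster  # cluster label for each of the 7 rows
cluster$centers  # 3 x 2 matrix of cluster means
```

Since AGE and PRODUCTS are on very different scales, PRODUCTS will dominate the Euclidean distances; if that is not intended, clustering on scale(x) instead of x may be worth considering.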

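Note that k-means only converges to a local optimum that depends on the initial centers, so you should try several random starting points (for example via the nstart argument of kmeans) and keep the partition with the smallest total within-cluster sum of squares.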
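
A rough sketch of the multiple-starts idea, again on the posted data. The number of restarts (25) and the seed are arbitrary choices of mine; the explicit loop is only there to show what happens, since the nstart argument does the same thing internally.

```r
# Numeric columns from the original post.
x <- data.frame(
  AGE      = c(92, 43, 67, 22, 56, 89, 47),
  PRODUCTS = c(3253, 4144, 3246, 4144, 4087, 3836, 4379)
)

set.seed(1)  # arbitrary seed, only to make the run reproducible

# Run k-means from 25 random sets of initial centers.
runs <- lapply(1:25, function(i) kmeans(x, centers = 3))

# Total within-cluster sum of squares for each run (smaller is better).
wss <- sapply(runs, function(r) r$tot.withinss)

# Keep the run with the smallest criterion value.
best <- runs[[which.min(wss)]]

# Built-in equivalent: kmeans restarts 25 times and returns the best fit.
best_builtin <- kmeans(x, centers = 3, nstart = 25)
```

Comparing range(wss) gives a quick feel for how much the result depends on the starting points.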

HTH,
Ranjan

On Mon, 19 Mar 2007 20:12:10 +0100 "Sergio Della Franca" <sergio.della.franca at gmail.com> wrote:

> Dear R-helpers,
> 
> I'm trying to perform k-means clustering.
> 
> For example, I have this dataset (y):
> 
>   AGE  PRODUCTS  SEX
>    92      3253  M
>    43      4144  F
>    67      3246  M
>    22      4144  F
>    56      4087  F
>    89      3836  M
>    47      4379  M
> 
> My situation is the following:
> - If I use this code: cluster<-kmeans(y,3), the program doesn't run because
> the variable "SEX" isn't numeric.
> - If I use this code: cluster<-kmeans(y[,{"AGE"}],3), the program runs
> correctly.
> - If I use this code: cluster<-kmeans(y[,{"AGE" ; "PRODUCTS"}],3), the
> program runs correctly, but the k-means clustering is performed only on the
> variable "PRODUCTS".
> 
> I would like to perform the k-means clustering on the two numeric variables I
> have.
> How can I modify the k-means code to run the clustering on the numeric
> variables that I choose?
> 
> 
> Thank you in advance.
> 
> Sergio Della Franca.


