[R] Clustering of datasets

Jim Lemon drj|m|emon @end|ng |rom gm@||@com
Tue Sep 6 13:06:25 CEST 2022


Hi Subhamitra,
I've had a look at this and made some guesses about what you might be
trying to do. If I create a data frame with country, integration and
efficiency, I get a reasonable-looking three-cluster solution. This may be
completely wrong for what you want. When you plot the clusters of the
separate measures, the "Index" values are just the order of the countries
in the data frame. I can't see how this means anything unless you have
ordered the countries on some measure unknown to me. Also, I'm unsure of
what the two measures you are using represent. This may give you a start on
getting sensible clusters. Let me know how you go with it.
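
A minimal sketch of the "Index" point, using made-up values rather than the original data (the vector name and country labels are hypothetical): when plot() is given a single numeric vector, the x axis is simply the position of each value, so reordering the rows changes the picture without changing the data.

```r
# Hypothetical values, not the original data
efficiency <- c(A = 0.42, B = 0.10, C = 0.77, D = 0.35)

# With one numeric vector, plot() uses the positions 1..n as x ("Index"),
# so plot(efficiency) draws the same points as this explicit call
idx <- seq_along(efficiency)
plot(idx, efficiency, xlab = "Index", ylab = "Efficiency", pch = 19)

# Sorting the same values gives a different-looking plot of identical data
reordered <- efficiency[order(efficiency)]
plot(seq_along(reordered), reordered, xlab = "Index",
     ylab = "Efficiency", pch = 19)
```

The x position only carries information if the rows were deliberately ordered on some measure.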

# create a data frame with both measures
# (R is case-sensitive: DATA and Data must match the actual column names)
DMs<-data.frame(Country=DMs1$Country,Integration=DMs1$DATA,
 Efficiency=DMs2$Data)
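# perform the clustering (k-means starts from random centers, so the
# cluster numbering can differ between runs unless set.seed() is called first)
km<-kmeans(DMs[,2:3],centers=3)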
# plot the result
plot(DMs$Integration,DMs$Efficiency,
 main="DM clusters by Integration and Efficiency",
 xlab="Integration",ylab="Efficiency",pch=19,
 col=km$cluster)
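# label each point with its country, just above the point
text(DMs$Integration,DMs$Efficiency+0.03,DMs$Country,col=km$cluster)
# mark the cluster centers with larger points
points(km$centers,pch=rep(19,3),cex=3,col=1:3)
# legend position is in data coordinates; adjust for other data
legend(-0.3,-0.1,
 c("cluster 1","cluster 2","cluster 3"),
 col=1:3,pch=19)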
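
Because kmeans() picks its starting centers at random, repeated runs can number the clusters differently or occasionally settle on a different solution. The sketch below is one way to make the result repeatable, assuming a made-up stand-in data frame (`toy` and its values are hypothetical, not the DMs data); the seed values and `nstart = 25` are arbitrary choices.

```r
# Hypothetical stand-in for the two measures (made-up values)
set.seed(42)
toy <- data.frame(
  Integration = c(rnorm(5, -0.2, 0.05), rnorm(5, 0.1, 0.05), rnorm(5, 0.4, 0.05)),
  Efficiency  = c(rnorm(5, 0.1, 0.05), rnorm(5, 0.5, 0.05), rnorm(5, 0.2, 0.05))
)

# Fix the seed before each call; nstart tries several random starts
# and keeps the solution with the lowest within-cluster sum of squares
set.seed(1)
km1 <- kmeans(toy, centers = 3, nstart = 25)
set.seed(1)
km2 <- kmeans(toy, centers = 3, nstart = 25)

# Same seed and same data give the same cluster assignment
identical(km1$cluster, km2$cluster)
```

With the seed fixed, the colours in the plot and the legend labels stay attached to the same groups of countries from one run to the next.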

Jim

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sp_km1.png
Type: image/png
Size: 32739 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20220906/7476a136/attachment.png>

