[R] Clustering of datasets

Mon Sep 5 14:06:37 CEST 2022

Hi Subhamitra,
I think part of the problem is that you are passing a vector of values
rather than a matrix. As you have only one value for each country, the
points are plotted with the index on the x-axis and the value for each
country on the y-axis. Passing ylim=c(-0.12,0) means that you are
cutting off the lowest points. Here is an example that will give you
two clusters and show the values for the centers in the middle of the
plot. Perhaps this is all you need, but I suspect there is more work
to be done.

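n<-nrow(DMs) # 45 countries in the data as posted, not 46
k2<-kmeans(DMs[,2],centers=2)
# no ylim= here, so none of the points are cut off
plot(DMs[,2],col=k2$cluster,pch=19,xlim=c(1,n))
text(1:n,DMs[,2],DMs[,1],col=k2$cluster)
# cluster centers, drawn in the middle of the x-axis in the cluster colors
points(rep(n/2,2),k2$centers,pch=1:2,cex=2,col=1:2)
# kmeans numbers the clusters arbitrarily, so check k2$centers before
# trusting these labels; y=1 is outside the plot, so use a keyword position
legend("bottomleft",c("cluster 1: Highly Integrated","cluster 2: Less Integrated"),
col=1:2,pch=19)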
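
If it helps, here is a rough sketch of two checks that follow from the
points above: which values fall outside the ylim= you requested, and how
to renumber the clusters so that cluster 1 always has the higher center.
The names lims, outside, ord and cl are my own, the seed is arbitrary,
and which cluster counts as "highly integrated" is your call, not
something the code can decide.

```r
# Data as posted in the original message (45 countries)
DMs <- read.table(text = "Country DATA
IS -0.0092
BA -0.0235
HK -0.0239
JA -0.0333
KU -0.0022
OM -0.0963
QA -0.0706
SK -0.0322
SA -0.1233
SI -0.0141
TA -0.0142
UAE -0.0656
AUS -0.0230
BEL -0.0006
CYP -0.0085
CR -0.0398
DEN -0.0423
EST -0.0604
FIN -0.0227
FRA -0.0085
GER -0.0272
GrE -0.3519
ICE -0.0210
IRE -0.0057
LAT -0.0595
LITH -0.0451
LUXE -0.0023
MAL -0.0351
NETH -0.0048
NOR -0.0495
POL -0.0081
PORT -0.0044
SLOVA -0.1210
SLOVE -0.0031
SPA -0.0213
SWE -0.0106
SWIT -0.0152
UK -0.0030
HUNG -0.0086
CAN -0.0144
CHIL -0.0078
USA -0.0042
BERM -0.0035
AUST -0.0211
NEWZ -0.0538", header = TRUE, stringsAsFactors = FALSE)

# Values outside the limits requested in the original plot() call.
# R pads each axis by about 4% by default, so values only just outside
# the limits may still be drawn; values far outside are not.
lims <- c(-0.12, 0)
outside <- DMs[DMs$DATA < lims[1] | DMs$DATA > lims[2], ]
print(outside)

set.seed(1)  # arbitrary seed, only so that repeated runs agree
k <- kmeans(DMs$DATA, centers = 2, nstart = 25)

# kmeans numbers clusters arbitrarily. Renumber them so that cluster 1
# is the one with the higher center (closer to zero in these data).
ord <- order(k$centers[, 1], decreasing = TRUE)
cl <- match(k$cluster, ord)
print(table(cl))

# Plot with limits taken from the data so that nothing is cut off
plot(DMs$DATA, col = cl, pch = 19, ylim = range(DMs$DATA),
     xlab = "Index", ylab = "DATA")
text(seq_len(nrow(DMs)), DMs$DATA, DMs$Country, col = cl, pos = 3, cex = 0.7)
legend("bottomleft", legend = paste("cluster", 1:2), col = 1:2, pch = 19)
```

With the data as posted, the value furthest below the requested lower
limit is GrE at -0.3519, and table(cl) shows how many countries ended up
in each cluster, which is worth checking before you attach labels such
as "Highly Integrated" to them.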

Jim

On Mon, Sep 5, 2022 at 9:31 PM Subhamitra Patra
<subhamitra.patra using gmail.com> wrote:
>
> Dear all,
>
> I am trying to cluster my dataset using k-means clustering in R, but I am
> getting scattered results. I have pasted my code below. Please suggest
> where my code is going wrong. The data are read in before applying the
> k-means method, as follows.
>
> DMs<-read.table(text="Country DATA
>                       IS -0.0092
>                       BA -0.0235
>                       HK -0.0239
>                       JA -0.0333
>                       KU -0.0022
>                       OM -0.0963
>                       QA -0.0706
>                       SK -0.0322
>                       SA -0.1233
>                       SI -0.0141
>                       TA -0.0142
>                       UAE -0.0656
>                       AUS -0.0230
>                       BEL -0.0006
>                       CYP -0.0085
>                       CR -0.0398
>                       DEN -0.0423
>                       EST -0.0604
>                       FIN -0.0227
>                       FRA -0.0085
>                       GER -0.0272
>                       GrE -0.3519
>                       ICE -0.0210
>                       IRE -0.0057
>                       LAT -0.0595
>                       LITH -0.0451
>                       LUXE -0.0023
>                       MAL -0.0351
>                       NETH -0.0048
>                       NOR -0.0495
>                       POL -0.0081
>                       PORT -0.0044
>                       SLOVA -0.1210
>                       SLOVE -0.0031
>                       SPA -0.0213
>                       SWE -0.0106
>                       SWIT -0.0152
>                       UK -0.0030
>                       HUNG -0.0086
>                       CAN -0.0144
>                       CHIL -0.0078
>                       USA -0.0042
>                       BERM -0.0035
>                       AUST -0.0211
>                       NEWZ -0.0538",
>                  header = TRUE,stringsAsFactors=FALSE)
> library(cluster)
> k1<-kmeans(DMs[,2],centers=2,nstart=25)
> plot(DMs[,2],col=k1$cluster,pch=19,xlim=c(1,46), ylim=c(-0.12,0))
> text(1:46,DMs[,2],DMs[,1],col=k1$cluster)
> legend(10,1,c("cluster 1: Highly Integrated","cluster 2: Less Integrated"),
> col=1:2,pch=19)
>
>
> --
> *Best Regards,*
> *Subhamitra Patra*
> *Phd. Research Scholar*
> *Department of Humanities and Social Sciences*
> *Indian Institute of Technology, Kharagpur*
> *INDIA*
>