[R] Use clusters.stats function from a hierarchical clustering in R

Jovani T. de Souza jov@n|@ouz@5 @end|ng |rom gm@||@com
Thu Dec 10 22:20:07 CET 2020


I would greatly appreciate your help. I used the cluster.stats function from
the `fpc` package to compare two cluster solutions using a variety of
validation criteria, as you can see in the code below. However, I have two
questions:

1. Is it possible to tell which solution is better, 2 clusters or 5
clusters? If so, could you explain how I can determine this?

2. Does this package compare cluster solutions only two at a time, or is it
possible to compare more than two cluster solutions at once?

Thank you so much!

Best Regards.

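    library(rdist)      # loaded in the original post; not used below
    library(geosphere)  # distm(): geodesic distances from lon/lat
    library(fpc)        # cluster.stats(): cluster validation statistics

    df <- structure(list(Industries = c(1, 2, 3, 4, 5, 6),
                         Latitude = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
                         Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9),
                         Waste = c(526, 350, 526, 469, 534, 346)),
                    class = "data.frame", row.names = c(NA, -6L))

    df1 <- df

    # clusters
    coordinates <- df[c("Latitude", "Longitude")]
    # distm() expects (longitude, latitude), hence the column swap
    d <- as.dist(distm(coordinates[, 2:1]))
    fit.average <- hclust(d, method = "average")

    # solution with 2 clusters
    clusters <- cutree(fit.average, k = 2)
    df$cluster <- clusters

    # solution with 5 clusters
    clusters1 <- cutree(fit.average, k = 5)
    df1$cluster <- clusters1

    # statistics for the 2-cluster solution, plus its agreement with the
    # 5-cluster solution (passed as alt.clustering)
    cluster.stats(d, df$cluster, df1$cluster)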
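
To make the two questions concrete, here is a minimal sketch of one way they might be approached. It is not a recommended procedure. It assumes the same coordinates and average-linkage tree as above, and the names `ks`, `cuts`, `validation` and `agreement` are introduced here for illustration only. Since cluster.stats takes a single `alt.clustering`, the sketch calls it once per value of k to collect a few internal indices (`avg.silwidth`, `dunn`, `ch`), and once per pair of solutions to collect the corrected Rand index. With only six points several clusters are singletons, so the values should be read with caution.

```r
library(geosphere)
library(fpc)

# Same coordinates as in the post above.
coordinates <- data.frame(
  Latitude  = c(-23.8, -23.8, -23.9, -23.7, -23.7, -23.7),
  Longitude = c(-49.5, -49.6, -49.7, -49.8, -49.6, -49.9)
)
d <- as.dist(distm(coordinates[, c("Longitude", "Latitude")]))
fit.average <- hclust(d, method = "average")

# Candidate numbers of clusters (an assumption; any range 2..n-1 could be used).
ks <- 2:5
cuts <- lapply(ks, function(k) cutree(fit.average, k = k))

# One row of internal validation indices per candidate k.
validation <- t(sapply(seq_along(ks), function(i) {
  s <- cluster.stats(d, cuts[[i]])
  c(k = ks[i], avg.silwidth = s$avg.silwidth, dunn = s$dunn, ch = s$ch)
}))
print(validation)

# Pairwise agreement between solutions: one cluster.stats call per pair.
pairs <- combn(seq_along(ks), 2)
agreement <- data.frame(
  k1 = ks[pairs[1, ]],
  k2 = ks[pairs[2, ]],
  corrected.rand = apply(pairs, 2, function(p) {
    cluster.stats(d, cuts[[p[1]]], cuts[[p[2]]],
                  compareonly = TRUE)$corrected.rand
  })
)
print(agreement)
```

Larger values of the average silhouette width, the Dunn index and the Calinski-Harabasz index (`ch`) are usually read as favouring a solution, but the indices need not agree with each other, so the table is a starting point for comparison and not a verdict.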





