[R] Why CLARA clustering method does not give the same classes as when I do clustering manually?

Sarah Goslee sarah.goslee at gmail.com
Fri Feb 19 20:46:30 CET 2016


clara() is a version of pam() adapted to handle large datasets.

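pam() uses the entire dataset, and should give results identical to
your manual procedure, or nearly so. clara() finds its medoids on
random subsamples of the data, so the result depends on which
observations are drawn: it can differ slightly from what pam() gives
on the full dataset, and from run to run if you set rngR = TRUE.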
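
To check the pam() side of this, here is a minimal sketch on made-up
data. The helper nearest_medoid() is a hypothetical name, not part of
the cluster package, and it assumes plain Euclidean distance with no
standardization (metric = "euclidean", stand = FALSE). If your manual
procedure uses another metric or scaled data, the comparison will not
line up.

```r
library(cluster)

set.seed(1)
# Made-up data: two well-separated 2-D groups, 100 points each
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 5), ncol = 2))

fit <- pam(x, k = 2, metric = "euclidean")

# Hypothetical helper: assign each row of x to its nearest medoid,
# using Euclidean distance computed by hand
nearest_medoid <- function(x, medoids) {
  d <- sapply(seq_len(nrow(medoids)), function(j)
    sqrt(rowSums(sweep(x, 2, medoids[j, ])^2)))
  max.col(-d, ties.method = "first")  # column index of the smallest distance
}

manual <- nearest_medoid(x, fit$medoids)

# Cross-tabulate pam()'s labels against the manual assignment
table(pam = fit$clustering, manual = manual)
```

If the two assignments agree, each row of the table has a single
non-zero cell.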

The default values of samples (5) and sampsize (min(n, 40 + 2 * k))
are very small, so on a large dataset you can get substantially
different results from one set of subsamples to the next if you don't
increase them.
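
As a rough illustration, the sketch below runs clara() once with the
defaults and once with larger values of samples and sampsize. The data
are made up, and samples = 50 and sampsize = 500 are arbitrary choices
for this example, not recommendations. As far as I know, clara() uses
its own built-in random number generator unless rngR = TRUE, so
repeated calls with identical arguments return the same result.

```r
library(cluster)

set.seed(42)
# Made-up data: three overlapping 2-D groups, 1000 points each
x <- rbind(matrix(rnorm(2000, mean = 0), ncol = 2),
           matrix(rnorm(2000, mean = 3), ncol = 2),
           matrix(rnorm(2000, mean = 6), ncol = 2))

# Defaults: samples = 5, sampsize = min(n, 40 + 2 * k)
fit_small <- clara(x, k = 3)

# More and larger subsamples (arbitrary values for illustration)
fit_large <- clara(x, k = 3, samples = 50, sampsize = 500)

# Objective function for the final clustering of the full dataset;
# lower means observations sit closer to their medoids
c(default = fit_small$objective, larger = fit_large$objective)

# How the two clusterings differ, observation by observation
table(default = fit_small$clustering, larger = fit_large$clustering)
```

Off-diagonal counts in the second table (after matching up the
cluster labels) are observations whose cluster changed only because
different subsamples produced different medoids.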

Sarah

On Fri, Feb 19, 2016 at 6:30 AM, ABABAEI, Behnam
<Behnam.ABABAEI at limagrain.com> wrote:
> Hi,
>
>
> I am using CLARA (in 'cluster' package). This method is supposed to assign each observation to the closest 'medoid'. But when I calculate the distance of medoids and observations manually and assign them manually, the results are slightly different (1-2 percent of occurrence probability). Does anyone know how clara calculates dissimilarities and why I get different clustering results?
>
>
> Behnam.
