[R] references on cluster analysis

Giampiero Salvi giampi at speech.kth.se
Sat Feb 7 23:40:36 CET 2004

Hi all,
I'm doing a study on predicting the "true" number of clusters in
a hierarchical clustering scheme. At the moment my main reference is

Milligan GW and Cooper MC (1985) "An examination of procedures for
determining the number of clusters in a data set"
Psychometrika vol 50 no 2 pp 159-179

and all the references included in that paper.

I'm planning to perform a similar comparison on a number of indexes,
but on a much larger data set (on the order of 3000 points), and with
a much higher "true" number of clusters (on the order of a few hundred),
to see if the properties of the indexes scale accordingly.

I was wondering if the set of indexes described in that reference is
still "state of the art" (most of them were introduced in the '60s
and '70s), or if there are new indexes and methods I could include in
my study. I would really appreciate it if you could point me to some newer
references addressing this problem.

I also read Milligan's chapter in the book "Clustering and
Classification" from 1995, but didn't find any information on this subject
that wasn't already covered in the paper above.

Thank you very much,

Giampiero Salvi, M.Sc.          www.speech.kth.se/~giampi
Speech, Music and Hearing       Tel:      +46-8-790 75 62
Royal Institute of Technology   Fax:      +46-8-790 78 54
Drottning Kristinasv. 31,  SE-100 44,  Stockholm,  Sweden
