[R] references on cluster analysis

Christian Hennig fm3a004 at math.uni-hamburg.de
Mon Feb 23 10:30:23 CET 2004


Hi,

> Martin Maechler wrote:
> 
> > Back from my vacation, I haven't seen an R-help answer on this
> >   (Christian, where have you been ? ;-)

(Uh, I missed this one. Too much spam?)

I would add information-based criteria (AIC, BIC and so on) used
together with a normal mixture model (implemented in package mclust). Four
of these criteria are compared in Celeux and Soromenho (1996), An Entropy
Criterion for Assessing the Number of Clusters in a Mixture Model, Journal
of Classification 13, 195-212, which also gives more references.
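
To make the idea concrete, here is a minimal sketch of choosing the number
of mixture components by BIC. It is not mclust and not the procedure from
the paper: it is plain Python, restricted to one-dimensional data, with a
hand-written EM fit. The function names, the variance floor, the quantile
start and the toy data are all assumptions made for this illustration. BIC
is written here as -2 log L + p log n, so smaller is better; software may
use the opposite sign.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, weights, means, variances):
    """Density of the normal mixture at x."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_mixture_loglik(xs, k, iters=200):
    """Fit a k-component 1-D normal mixture by EM; return the log-likelihood."""
    n = len(xs)
    ordered = sorted(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    # Deterministic start (an assumption): means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    var_floor = 0.01 * grand_var  # hypothetical guard against collapsing components
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: weighted updates of proportions, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            ss = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs))
            variances[j] = max(ss / nj, var_floor)
    return sum(math.log(max(mixture_density(x, weights, means, variances), 1e-300))
               for x in xs)


def bic(loglik, k, n):
    """BIC = -2 log L + p log n, with p = 3k - 1 free parameters in 1-D."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free proportions
    return -2.0 * loglik + n_params * math.log(n)


# Toy data (an assumption): two well-separated groups built from normal quantiles.
base = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
data = base + [x + 10.0 for x in base]

bic_by_k = {k: bic(fit_mixture_loglik(data, k), k, len(data)) for k in range(1, 5)}
best_k = min(bic_by_k, key=bic_by_k.get)
print(bic_by_k)
print("number of components chosen by BIC:", best_k)
```

The same loop works with AIC by replacing the penalty p log n with 2p. For
real data one would use several random starts and a multivariate model
rather than this single deterministic start.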

Note that there are also a number of clustering approaches in the recent
literature that determine the number of clusters implicitly (rather than
by optimizing a criterion over all candidate cluster numbers), e.g., DBSCAN.
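
As an illustration of what "implicitly" means here, the following is a
small sketch of the DBSCAN idea in plain Python. The number of clusters is
never passed in; it falls out of two tuning constants, a neighbourhood
radius and a minimum neighbourhood size. The names `eps` and `min_pts`, the
brute-force neighbour search and the toy points are assumptions made for
this example, not a reference implementation.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Brute-force search; the point itself counts as its own neighbour.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(nb)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is also a core point: keep growing
    return labels


# Toy data (an assumption): two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
n_clusters = len({lab for lab in labels if lab != NOISE})
print(labels, n_clusters)
```

The price is that the choice of the radius and the minimum size replaces
the choice of the number of clusters, so these methods are not free of
tuning either.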

Christian

> >>>>>>"GiampS" == Giampiero Salvi <giampi at speech.kth.se>
> >>>>>>    on Sat, 7 Feb 2004 23:40:36 +0100 (CET) writes:
> > 
> > 
> >     GiampS> Hi all, I'm doing a study on predicting the "true"
> >     GiampS> number of clusters in a hierarchical clustering
> >     GiampS> scheme. My main reference is at the moment
> > 
> >     GiampS> Milligan GW and Cooper MC (1985) "An examination of
> >     GiampS> procedures for determining the number of clusters in
> >     GiampS> a data set" Psychometrika vol 50 no 2 pp 159-179
> > 
> >     GiampS> and all the references included in that paper.
> > 
> >     GiampS> I'm planning to perform a similar comparison on a
> >     GiampS> number of indexes, but on a much larger data set (in
> >     GiampS> the order of 3000 points), and with a much higher
> >     GiampS> "true" number of clusters (in the order of some
> >     GiampS> hundreds), to see if the properties of the indexes
> >     GiampS> scale accordingly.
> > 
> >     GiampS> I was wondering if the set of indexes described in
> >     GiampS> the reference are still "state of the art" (most of
> >     GiampS> them were introduced in the '60s and '70s), or if
> >     GiampS> there are new indexes and methods I could include in
> >     GiampS> my study. I would really appreciate if you could
> >     GiampS> point me to some newer references addressing this problem.

***********************************************************************
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag-online.de