[R] Dynamic clustering?

Greg Snow Greg.Snow at imail.org
Thu May 6 21:28:39 CEST 2010


You could do a hierarchical clustering, then compare the height of the last merge with the heights of the earlier merges. For your data:

> tmp <- hclust( dist( c(1,2,3,2,3,1,2,3,400,300,400) ) )
> tmp2 <- hclust( dist( c(400,402,405, 401,410,415, 407,412) ) )
> tmp$height
 [1]   0   0   0   0   0   0   1   2 100 399
> tmp2$height
[1]  1  2  2  2  5  7 15

You still need to make some assumptions and come up with a method for choosing a cutoff, but this may help get you started.
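
As one possible starting point, here is a minimal sketch of such a cutoff rule. It compares the last merge height with the one just before it and splits the data only when that ratio is large. The helper name last_merge_ratio and the cutoff value of 3 are assumptions made for illustration; they are not a recommended default, and other summaries of the heights could serve equally well.

```r
# Hypothetical helper (an illustration, not an established method):
# ratio of the final merge height to the second-to-last merge height.
# Needs at least 3 observations so that two merge heights exist.
last_merge_ratio <- function(x) {
  h <- hclust(dist(x))$height  # merge heights; hclust defaults to complete linkage
  n <- length(h)
  h[n] / h[n - 1]
}

two <- c(1, 2, 3, 2, 3, 1, 2, 3, 400, 300, 400)
one <- c(400, 402, 405, 401, 410, 415, 407, 412)

last_merge_ratio(two)  # 399 / 100
last_merge_ratio(one)  # 15 / 7

# Assumed cutoff, chosen only for this example.
cutoff <- 3

# Return two groups if the last merge stands out, otherwise a single group.
split_if_separated <- function(x, cutoff) {
  if (last_merge_ratio(x) > cutoff) {
    cutree(hclust(dist(x)), k = 2)
  } else {
    rep(1L, length(x))
  }
}

groups_two <- split_if_separated(two, cutoff)
groups_one <- split_if_separated(one, cutoff)
```

With these two vectors the ratios are about 3.99 and 2.14, so the margin around the assumed cutoff is small; the value would need tuning on real data.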

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Ralf B
> Sent: Wednesday, May 05, 2010 3:18 PM
> To: r-help at r-project.org
> Subject: [R] Dynamic clustering?
> 
> Are there R packages that allow for dynamic clustering, i.e. where the
> number of clusters is not predefined? I have a list of numbers that
> falls into either 2 clusters or just 1. Here is an example of one that
> should be clustered into two clusters:
> 
> two <- c(1,2,3,2,3,1,2,3,400,300,400)
> 
> and here one that only contains one cluster and would therefore not
> need to be clustered at all.
> 
> one <- c(400,402,405, 401,410,415, 407,412)
> 
> Given a sufficiently large amount of data, a statistical test or an
> effect size should be able to determine whether it makes sense to
> divide a data set, i.e. whether there are two groups that differ enough. I am
> not familiar with the underlying techniques in kmeans, but I know that
> it blindly divides both data sets based on the predefined number of
> clusters. Are there any more sophisticated methods that allow me to
> determine the number of clusters in a data set based on statistical
> tests or effect sizes ?
> 
> Is it possible that this is not a clustering problem but a
> classification problem?
> 
> Ralf


