[R] distance in the function kmeans

Thomas Petzoldt thpe at hhbio.wasser.tu-dresden.de
Fri May 28 14:47:10 CEST 2004


n.bouget at laposte.net wrote:

 > I don't exactly understand what you do, could you show me the
 > program that you execute to do that?

I did such things some time ago, so the following is (as usual) without
warranty. There are several methods, e.g. using Cholesky factorization,
singular value decomposition or principal components. Given "mdata" as the
original data matrix, it works with hclust and should be applicable to
kmeans too:

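# with svd (centre the columns, keep the left singular vectors)
z <- svd(scale(mdata, scale=FALSE))$u
cl <- hclust(dist(z), method="ward.D")  # "ward" is called "ward.D" in R >= 3.1.0

# with princomp (scores rescaled to unit variance)
pc <- princomp(mdata, cor=FALSE)
pcdata <- as.data.frame(scale(pc$scores))
cl <- hclust(dist(pcdata), method="ward.D")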
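
The Cholesky variant mentioned above is not spelled out, so here is a minimal sketch of how it might look, followed by a call to kmeans on the transformed data. The built-in iris measurements are only a stand-in for "mdata", and the number of centers and nstart are arbitrary choices made for illustration.

```r
# Stand-in for "mdata" (assumption: any numeric matrix of full column rank)
mdata <- as.matrix(iris[, 1:4])

# Cholesky factor of the covariance matrix: S = t(R) %*% R, R upper triangular
S <- cov(mdata)
R <- chol(S)

# Centre the data and multiply by the inverse factor; the transformed
# columns are then uncorrelated with unit variance
z <- scale(mdata, scale = FALSE) %*% solve(R)

# kmeans on the transformed data (centers and nstart chosen arbitrarily)
set.seed(1)
km <- kmeans(z, centers = 3, nstart = 10)
table(km$cluster)
```

Since cov(z) is the identity matrix here, the Euclidean distances that kmeans minimises on z no longer depend on the scale or correlation of the original columns.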


As I mentioned, this is only an example showing that methods working
with the Euclidean distance can be applied to other distance measures
when an appropriate transformation of the data exists. According to
Gavin, there are indeed some other possibilities.
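
As a rough check of this idea, the sketch below compares Euclidean distances on the svd-transformed data with Mahalanobis distances on the original data. It assumes the distance of interest is the Mahalanobis distance, again uses iris as a stand-in for "mdata", and relies on the two agreeing up to a constant factor of sqrt(n - 1) when the data matrix has full column rank.

```r
# Stand-in for "mdata" (assumption: numeric matrix of full column rank)
mdata <- as.matrix(iris[, 1:4])
n <- nrow(mdata)

# Euclidean distances between rows of the svd-transformed data,
# rescaled by sqrt(n - 1)
z <- svd(scale(mdata, scale = FALSE))$u
d_euclid <- as.matrix(dist(z)) * sqrt(n - 1)

# Mahalanobis distances of every row from the first row of the
# original data (mahalanobis() returns squared distances)
d_mahal <- sqrt(mahalanobis(mdata, center = mdata[1, ], cov = cov(mdata)))

# Expected to agree up to numerical tolerance
all.equal(unname(d_euclid[1, ]), unname(d_mahal))
```

The constant factor does not change which observations are closest to each other, so clustering on z behaves like clustering with the Mahalanobis distance.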

Thomas P.



