[R] distance in the function kmeans

Gavin Simpson gavin.simpson at ucl.ac.uk
Fri May 28 13:21:44 CEST 2004


Thomas Petzoldt wrote:
> n.bouget wrote:
> 
>> Hi,
>> I want to know which distance is used in the function kmeans
>> and whether we can change this distance. In the function pam, we 
>> can pass a distance matrix as an
>> argument (with the line "pam<-pam(dist(matrixdata),k=7)"), but
>> we can't do that in the function kmeans; we have to pass the
>> data matrix directly ...
>> Thanks in advance,
>> Nicolas BOUGET
> 
> 
> One solution is to transform the data in such a way that the Euclidean 
> distance of the transformed values represents some other distance of the 
> original values. This works at least for the Mahalanobis distance, when 
> one applies a multivariate technique to a PCA-transformed and re-scaled 
> matrix, but I don't know if there are transformations for some other 
> distance measures.
> 
> Thomas P.
> 

Other solutions from an ecological paper are:

Chord distance
Chi-square metric
Chi-square distance
Hellinger distance
Distance between species profiles

Each of these can be seen as the Euclidean distance computed on some 
transformation of the data, so kmeans can be applied to the transformed 
data.
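
As a rough illustration of the idea, here is a minimal sketch for the chord 
distance, using base R only. The abundance matrix is made up, and the object 
names are just for this example. It checks that the Euclidean distance on 
row-normalised data matches the chord distance computed directly from the 
raw data.

```r
# Toy abundance matrix (made-up numbers): 4 sites x 3 species
x <- matrix(c(10, 0, 5,
               2, 8, 0,
               0, 3, 7,
               4, 4, 4), nrow = 4, byrow = TRUE)

# Chord transformation: scale each row (site) to unit Euclidean length
row_norm <- sqrt(rowSums(x^2))
x_chord <- x / row_norm

# Euclidean distance on the transformed data ...
d_transformed <- dist(x_chord)

# ... compared with the chord distance computed directly from the raw
# data, sqrt(2 * (1 - cosine similarity)) for each pair of rows.
# pmax() guards against tiny negative values from rounding.
cosine <- tcrossprod(x) / outer(row_norm, row_norm)
d_direct <- as.dist(sqrt(pmax(2 * (1 - cosine), 0)))

all.equal(as.vector(d_transformed), as.vector(d_direct))
```
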

The paper "Ecologically meaningful transformations for ordination of 
species data" by Pierre Legendre and Eugene D. Gallagher (2001), Oecologia 
129(2), 271-280, explains the concept and how to do the 
transformations.

An R example is given in the help file of decostand() in Jari Oksanen's 
vegan package for two of the transformations mentioned above.
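
For the original kmeans question, something along these lines might work. 
This is only a sketch with made-up data: it does the Hellinger transformation 
by hand in base R and then runs kmeans on the result, so that the clustering 
is in effect based on the Hellinger distance. The choice of 2 clusters and 
the object names are assumptions for the example.

```r
# Toy abundance matrix (made-up numbers): 6 sites x 3 species
x <- matrix(c(20,  1,  0,
              18,  2,  1,
              15,  0,  2,
               0, 12,  9,
               1, 10, 11,
               2, 14,  8), nrow = 6, byrow = TRUE)

# Hellinger transformation: square root of the row-wise relative abundances.
# With vegan installed, decostand(x, method = "hellinger") should give
# the same matrix.
x_hel <- sqrt(x / rowSums(x))

# kmeans on the transformed data; nstart > 1 tries several random starts
set.seed(1)
km <- kmeans(x_hel, centers = 2, nstart = 10)
km$cluster
```

The cluster centres in km$centers are then on the transformed scale, not in 
the original abundance units.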

Gav

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [T] +44 (0)20 7679 5522
ENSIS Research Fellow             [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC                 [E] gavin.simpson at ucl.ac.uk
UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



