[R] Transformation of dissimilarity or distance matrix

Jan_Svatos at eurotel.cz
Wed May 30 13:35:11 CEST 2001


Thanks to Prof. Brian Ripley
for a quick and useful answer.
I was wrong: I was thinking of some measure of "similarity" instead of
dissimilarity.
Of course, with dissimilarities and distances my question makes no sense.
The calls to hclust() and dist() with a 3000 x 3 matrix succeeded on

platform i386-pc-mingw32
arch     x86
os       Win32
system   x86, Win32
status
major    1
minor    2.3
year     2001
month    04
day      26
language R

with 256M of memory,

but agnes() failed because the total memory allocation was exhausted.
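
For a rough sense of scale, the storage needed for 3000 observations can be worked out by hand. The sketch below assumes 8-byte doubles and counts only the distance data itself; it says nothing about the working space a particular routine such as agnes() needs internally, which is not measured here.

```r
# Back-of-envelope memory estimate for n = 3000 observations,
# assuming 8 bytes per stored number (R's double precision).
n <- 3000
n_pairs <- n * (n - 1) / 2      # entries in the lower triangle kept by dist()
dist_mb <- n_pairs * 8 / 2^20   # size of one "dist" object, in MB
full_mb <- n^2 * 8 / 2^20       # size of one full n x n matrix, in MB
round(c(pairs = n_pairs, dist_mb = dist_mb, full_mb = full_mb), 2)
```

That is about 4.5 million pairwise distances: roughly 34 MB as a "dist" object and roughly 69 MB as a full square matrix. On a 256M machine, a few working copies of either would already be a large share of the memory available.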

Jan


Prof Brian Ripley <ripley at stats.ox.ac.uk>
05/30/2001 12:45 PM



Re: [R] Transformation of dissimilarity or distance matrix

On Wed, 30 May 2001 Jan_Svatos at eurotel.cz wrote:

>
> Dear List,
>
> is there an elegant (or even not elegant) way how to transform
> dissimilarity or distance matrix A
> (or, in general, arbitrary symmetrical matrix) by transposition of rows and
> columns into a form
> closest to "block diagonal" matrix B?
> The matrix A is adjusted the following way
>
> A[A < epsilon] <- 0  # epsilon is a given "small" number
>
> B: (in its ideal form)
>
> b_{11}...b_{1i} 0...0
> ...
> b_{i1}...b_{ii}
> 0...0          b_{i+1, i+1}
>
> etc,
> with "reasonable" number of blocks.
> Dimensions of this problem: about 3000 rows (given) and about 30-45 blocks
> (expected).
>
> If there is some function for this task in "Matrix, multiv,..." packages,
> then RTFM is a perfectly good answer.

This makes no sense to me.  In your `approximation' most pairs of objects
are at zero distance from each other.  With dissimilarity or distance
matrices the entries are small within clusters and large between clusters,
and in particular the diagonal is zero.

For the reverse problem, with blocks of zeroes on the diagonal, any
reasonable clustering algorithm will give you a permutation with
the small entries near and on the diagonal.  For example, hclust in
package mva.  *However*, I doubt if anything in R will be very happy with
a distance matrix on 3000 rows unless you have a lot of memory.
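
As an illustration of the permutation idea (a minimal sketch, not part of the original exchange): the leaf order returned by hclust() can be used to reorder the rows and columns of a distance matrix together, so that small entries gather in blocks along the diagonal. The toy data, the choice of average linkage, and k = 3 are assumptions made for the example. In current versions of R, hclust() and dist() are in the stats package rather than mva.

```r
# Toy data (assumed for illustration): three well-separated groups in 3 dimensions.
set.seed(1)
centres <- rep(c(0, 10, 20), each = 20)          # one centre value per row
x <- matrix(rnorm(60 * 3), ncol = 3) + centres   # recycled down each column
x <- x[sample(nrow(x)), ]                        # shuffle rows to hide the blocks

d   <- dist(x)                         # Euclidean distances between rows
hc  <- hclust(d, method = "average")   # hierarchical clustering
ord <- hc$order                        # leaf order of the dendrogram

B <- as.matrix(d)[ord, ord]            # same matrix, rows and columns permuted together
groups <- unname(cutree(hc, k = 3)[ord])  # cluster labels in the permuted order
rle(groups)$lengths                    # run lengths of consecutive equal labels
```

Because each cluster is a subtree of the dendrogram, its members are contiguous in hc$order, so B has small entries in diagonal blocks and large entries elsewhere. image(B) gives a quick visual check.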

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595





