[Rd] PR#4195

ripley at stats.ox.ac.uk
Sun Nov 9 14:25:40 MET 2003


I've taken a look at this. What the R code does is recalculate the
nearest neighbours and distances after updating the distances, for all
clusters other than the new one, which it attempted to handle on the fly.
The problem is that merging two clusters can make distances to the merged
cluster go up, so what was a nearest neighbour may stop being one (and
vice versa). So I am not convinced that the correction is in fact enough
(what if i2 had previously been the nearest neighbour of k?), although
this may not affect the later steps.
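
To make the failure mode concrete, here is a small sketch in Python (not the
R sources). It assumes complete linkage, where the distance to a merged
cluster is the larger of the two old distances; the labels a, b, c, k and all
the numbers are made up for illustration and are not taken from the PR's
dataset.

```python
# Hypothetical distances between four clusters (illustrative values only).
dist = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "k")): 2.0,
    frozenset(("b", "k")): 5.0,
    frozenset(("c", "k")): 3.0,
    frozenset(("a", "c")): 6.0,
    frozenset(("b", "c")): 6.0,
}


def nearest(dist, k, active):
    """Return (neighbour, distance) for the active cluster closest to k."""
    candidates = ((j, dist[frozenset((k, j))]) for j in active if j != k)
    return min(candidates, key=lambda pair: pair[1])


# Before the merge, the nearest neighbour of k is a.
before = nearest(dist, "k", ["a", "b", "c", "k"])

# Merge a and b into "ab". Assumed complete-linkage update:
# d(ab, x) = max(d(a, x), d(b, x)), so the distance can only go up.
for x in ("c", "k"):
    dist[frozenset(("ab", x))] = max(
        dist[frozenset(("a", x))], dist[frozenset(("b", x))]
    )

# After the merge, k's old neighbour sits inside "ab", now at distance 5.0,
# so a stored nearest-neighbour entry for k would be stale.
after = nearest(dist, "k", ["ab", "c", "k"])

print(before, after)
```

Here the stored neighbour of k pointed at one of the merged clusters, and the
true nearest neighbour after the merge is a third cluster that was not
involved in the merge at all.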

It is just as fast to update all nearest neighbours, and I have changed
the R code to do so. It is a lot easier to convince oneself that the code
gives the correct answer! I now get an answer much closer to yours, and
hope the difference is due to rounding errors in dumping the dataset. (It
is also what I got by incorporating the fix in the C code you pointed us
to.)
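
The "update all nearest neighbours" approach can be sketched as follows.
This is a minimal Python illustration, not the code in R: the function names
update_all_nearest and cluster_complete are hypothetical, complete linkage is
assumed for the distance update, and the input is a small made-up symmetric
distance matrix.

```python
def update_all_nearest(D, active):
    """Recompute nearest neighbour and distance for every active cluster."""
    nn, nn_dist = {}, {}
    for i in active:
        best_j, best_d = None, float("inf")
        for j in active:
            if j != i and D[i][j] < best_d:
                best_j, best_d = j, D[i][j]
        nn[i], nn_dist[i] = best_j, best_d
    return nn, nn_dist


def cluster_complete(D):
    """Agglomerate with (assumed) complete linkage.

    Returns a list of (i, j, height) merges; the merged cluster keeps
    the smaller index i.
    """
    D = [row[:] for row in D]              # work on a copy
    active = list(range(len(D)))
    merges = []
    while len(active) > 1:
        # No incremental bookkeeping: every neighbour is recomputed
        # from the current distances before each merge.
        nn, nn_dist = update_all_nearest(D, active)
        i = min(active, key=lambda c: nn_dist[c])   # closest pair
        j = nn[i]
        i, j = min(i, j), max(i, j)
        merges.append((i, j, D[i][j]))
        # Complete-linkage update of distances to the merged cluster.
        for k in active:
            if k != i and k != j:
                D[i][k] = D[k][i] = max(D[i][k], D[j][k])
        active.remove(j)
    return merges


# Made-up 4-point example (illustrative values only).
D = [
    [0.0, 1.0, 2.0, 6.0],
    [1.0, 0.0, 5.0, 6.0],
    [2.0, 5.0, 0.0, 3.0],
    [6.0, 6.0, 3.0, 0.0],
]
merges = cluster_complete(D)
print(merges)
```

Because nothing is carried over from one step to the next, there is no stored
neighbour that can go stale, which is what makes this version easy to check
by hand. In this sketch the full rescan costs a pass over all pairs of active
clusters per merge.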

BDR

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


