[Rd] Erratic behaviour of sammon()

Peter Kleiweg kleiweg@let.rug.nl
Thu, 1 Nov 2001 22:15:40 +0100 (CET)


I'm not sure this list is the right place for this report.

I noticed some erratic behaviour in sammon(). Running sammon() on
two nearly identical sets of data gives very different results.
Below is an example. I create an initial configuration with
cmdscale() and store it in 'vec1'. I write this to a file and
read it back in as 'vec2'. According to cor() on the three
columns of 'vec1' and 'vec2', they are perfectly correlated.
However, sammon() gives different results depending on whether
it is initialised from 'vec1' or from 'vec2'. (SAMMON() is a
wrapper function.)

I did this on a Linux machine (R version 1.3.1):

	> dst <- ReadDistFile("PA.lnk")
	Loading required package: mva
	> vec1 <- MDS(dst, 3)
	> WriteVectorFile(vec1, "outfile")
	> vec2 <- ReadVectorFile("outfile")
	> cor(vec1[,1], vec2[,1])
	[1] 1
	> cor(vec1[,2], vec2[,2])
	[1] 1
	> cor(vec1[,3], vec2[,3])
	[1] 1
	> v1 <- SAMMON(dst, 3, y=vec1)
	Loading required package: MASS
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  30 iters: 0.05346, magic = 0.491
	stress after  40 iters: 0.04985, magic = 0.500
	stress after  50 iters: 0.04945, magic = 0.500
	stress after  60 iters: 0.04931, magic = 0.500
	stress after  70 iters: 0.04925, magic = 0.500
	> v2 <- SAMMON(dst, 3, y=vec2)
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  30 iters: 0.05369, magic = 0.491
	stress after  30 iters: 0.05369
	> cor(v1[,1], v2[,1])
	[1] 0.958089
	> cor(v1[,2], v2[,2])
	[1] 0.979837
	> cor(v1[,3], v2[,3])
	[1] 0.9412055

I also tried it on HP-UX, and got different results again:

	> dst <- ReadDistFile("PA.lnk")
	Loading required package: mva
	> vec1 <- MDS(dst, 3)
	> WriteVectorFile(vec1, "outfile")
	> vec2 <- ReadVectorFile("outfile")
	> cor(vec1[,1], vec2[,1])
	[1] 1
	> cor(vec1[,2], vec2[,2])
	[1] 1
	> cor(vec1[,3], vec2[,3])
	[1] 1
	> v1 <- SAMMON(dst, 3, y=vec1)
	Loading required package: MASS
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  28 iters: 0.06761
	> v2 <- SAMMON(dst, 3, y=vec2)
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  30 iters: 0.06719, magic = 0.020
	stress after  40 iters: 0.06115, magic = 0.009
	stress after  50 iters: 0.05198, magic = 0.500
	stress after  60 iters: 0.04968, magic = 0.500
	stress after  70 iters: 0.04933, magic = 0.500
	stress after  80 iters: 0.04924, magic = 0.225
	> cor(v1[,1], v2[,1])
	[1] 0.9106865
	> cor(v1[,2], v2[,2])
	[1] 0.9727502
	> cor(v1[,3], v2[,3])
	[1] 0.9411287
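
A correlation of 1 only shows that the columns are perfectly
correlated, not that they are bit-for-bit equal. A minimal sketch
of a stricter check follows. It is an illustration only: it uses
the built-in eurodist data as a stand-in for PA.lnk, and
write.table()/read.table() as a stand-in for the wrappers
WriteVectorFile() and ReadVectorFile(), whose behaviour is not
shown here.

```r
# Stand-in data: eurodist ships with R; it is NOT the PA.lnk data.
d <- eurodist
vec1 <- cmdscale(d, k = 3)

# Round trip through a text file (assumed stand-in for
# WriteVectorFile()/ReadVectorFile()).
tmp <- tempfile()
write.table(vec1, tmp, row.names = FALSE, col.names = FALSE)
vec2 <- as.matrix(read.table(tmp))
unlink(tmp)

# cor() is insensitive to tiny perturbations (and to shifts and rescaling).
cors <- sapply(1:3, function(j) cor(vec1[, j], vec2[, j]))

# Stricter checks: exact equality and the largest absolute difference.
same_bits <- identical(unname(vec1), unname(vec2))
max_diff  <- max(abs(vec1 - vec2))

print(cors)
print(same_bits)
print(max_diff)
```

If max_diff is not exactly zero, the two starting configurations
differ in their last digits even though cor() reports 1.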

I even tried compiling with gcc, both with and without
optimisation, and got different results for exactly the same
input (without saving to a file first).
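
One possible explanation, offered as an assumption and not as a
diagnosis, is that an optimising compiler may evaluate
floating-point expressions in a different order or precision, and
floating-point addition is not associative. The sketch below shows
only that general effect; it does not reproduce anything inside
sammon().

```r
# The same three numbers, summed in two different orders.
a <- (0.1 + 0.2) + 0.3
b <- 0.1 + (0.2 + 0.3)

# Exact comparison versus comparison within a numerical tolerance.
exactly_equal <- identical(a, b)
nearly_equal  <- isTRUE(all.equal(a, b))

print(exactly_equal)
print(nearly_equal)
print(a - b)
```

A difference of this size is harmless in a single sum, but an
iterative procedure that branches on comparisons between
successive stress values could in principle amplify it.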

So, I gather that sammon() is an unstable function, extremely
sensitive to the tiniest of variations. Is this inherent to the
sammon algorithm, or is there something wrong with how it is
implemented in R?
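
One way to probe this question is to run sammon() from two nearly
identical starts with a tighter tolerance and more iterations, and
then compare the final stress and the fitted inter-point distances
instead of the raw columns. The sketch below assumes the MASS
package is installed. It uses eurodist as stand-in data, mimics the
file round trip with signif(), and the niter and tol values are
guesses, not recommendations.

```r
library(MASS)  # provides sammon()

d  <- eurodist            # stand-in data, NOT the PA.lnk data
y1 <- cmdscale(d, k = 3)  # initial configuration
y2 <- signif(y1, 7)       # assumed stand-in for a lossy round trip via a file

# niter and tol are arguments of MASS::sammon(); the values are guesses.
s1 <- sammon(d, y = y1, k = 3, niter = 1000, tol = 1e-8, trace = FALSE)
s2 <- sammon(d, y = y2, k = 3, niter = 1000, tol = 1e-8, trace = FALSE)

# The final stress and the inter-point distances do not depend on
# rotation or reflection of the configuration.
stress_gap <- abs(s1$stress - s2$stress)
dist_cor   <- cor(as.vector(dist(s1$points)), as.vector(dist(s2$points)))

print(c(s1$stress, s2$stress))
print(stress_gap)
print(dist_cor)
```

If the two runs still end at clearly different stress values, the
sensitivity is unlikely to come from the stopping rule alone.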

-- 
Peter Kleiweg
http://www.let.rug.nl/~kleiweg/
