dist(*, "euclidean") [was "dist function suggestion"]

Prof Brian Ripley <ripley@stats.ox.ac.uk>
Wed, 20 Jan 1999 13:48:53 +0000 (GMT)

>     BDR> You will need to call it something else: dist is a clone of an S
>     BDR> function, and dist(X, "manhattan") is well-established usage.
> one could still imagine an extra Y argument such that
> 	dist(X, Y=myY, method="euclidean")
> and	dist(X, "euclidean", Y=myY)
> would work
> one could even make it such that
> both
> 	dist(X, myY)
> and	dist(X, "euclidean")
> would work.  However, the extra hack for 	dist(X, Y)

Um, I did think about that, but precisely what would the rules be?
For now the second argument can be anything coercible to character
(why is match.arg not used?), or should be, since ?pmatch says

       x: the values to be matched.

without mentioning mode, although the internal code has isString
(whatever precisely that is).  In particular, names fail in R but not
S, although as documented they should work in R but not S:

> pmatch(as.name("test"), "test")
Error in pmatch(x, table, duplicates.ok) : argument is not of mode character
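
For illustration only: a minimal sketch of sidestepping the error above,
assuming explicit coercion to character by the caller is acceptable. It
does not change what pmatch itself accepts.

```r
# Coerce the name to a character string before matching.
nm  <- as.name("test")
idx <- pmatch(as.character(nm), "test")
idx   # index of the matching table entry
```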

So at present if the second argument is a character matrix it works.
And there is an as.matrix on X.  So what do you want Y to be?
numeric?  (At present I can have X as a data frame with factors, e.g.
dist(data.frame(factor(1:10)))!)  matrix?  ?
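
To make the question concrete, here is a rough sketch of one possible set
of rules, not a proposal for the real dist.  The name dist2, its argument
names and the short list of methods are all hypothetical.  The assumed
rule is: a character second argument is a method name (partially matched
with match.arg), anything else is taken as Y, and both X and Y must be
numeric after as.matrix.

```r
# Hypothetical wrapper; NOT the dist() from mva.
dist2 <- function(X, second = "euclidean", Y = NULL,
                  methods = c("euclidean", "maximum", "manhattan")) {
    # Assumed rule: a non-character second argument is really Y.
    if (is.null(Y) && !is.character(second)) {
        Y <- second
        second <- "euclidean"
    }
    method <- match.arg(second, methods)   # allows dist2(X, "man")

    X <- as.matrix(X)
    Y <- if (is.null(Y)) X else as.matrix(Y)
    # Rejects factors and character data, which as.matrix turns into strings.
    stopifnot(is.numeric(X), is.numeric(Y), ncol(X) == ncol(Y))

    f <- switch(method,
                euclidean = function(d) sqrt(sum(d^2)),
                maximum   = function(d) max(abs(d)),
                manhattan = function(d) sum(abs(d)))

    # Full nrow(X) by nrow(Y) matrix of row-to-row distances.
    out <- matrix(0, nrow(X), nrow(Y))
    for (i in seq_len(nrow(X)))
        for (j in seq_len(nrow(Y)))
            out[i, j] <- f(X[i, ] - Y[j, ])
    out
}

X <- rbind(c(0, 0), c(3, 4))
dist2(X)                      # distances among the rows of X
dist2(X, "manhattan")         # second argument as method
dist2(X, rbind(c(0, 0)))      # second argument as Y
```

Note that under this rule a character matrix in second position is still
read as a method name, so the ambiguity raised above does not go away; it
is only decided one way.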

While I am at it

> X <- c(Inf, 0, 0)
> library(mva)
> dist(X)
Error: NAs in foreign function call (arg 1)

is confusing: the error message is caused by Infs too.  (I found this
when writing a version of dist to handle NAs as S's does: the bare code is
now an example in the V&R R complements.)
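
A sketch of the kind of up-front check that would give a clearer message.
The function name check_finite and the wording of the messages are
hypothetical; this is neither the mva code nor the V&R version, and it
stops on NAs rather than handling them.

```r
# Hypothetical pre-check, to be run before passing X to compiled code.
check_finite <- function(X) {
    X <- as.matrix(X)
    if (!is.numeric(X)) stop("X must be numeric")
    if (any(is.na(X)))       stop("X contains NA or NaN")
    if (any(is.infinite(X))) stop("X contains Inf or -Inf")
    invisible(X)
}

# Capture the message instead of stopping, to show which check fired.
msg <- tryCatch(check_finite(c(Inf, 0, 0)),
                error = function(e) conditionMessage(e))
msg
```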


