[R] Canberra distance

Jari Oksanen jari.oksanen at oulu.fi
Tue Oct 16 16:21:34 CEST 2007


Frédéric Chiroleu <frederic.chiroleu <at> cirad.fr> writes:

> 
> Hi,
> 
> I do not understand the definition of the Canberra distance in R.
> 
> On the Internet and in the help pages of dist() from stats and Dist() from
> amap, the Canberra distance between vectors x and y, d(x,y), is:
> 
> d(x,y) = sum(abs(x-y)/(x+y))
> 
> But in practice, with simple examples, we find that the formula is:
> 
> d(x,y) = (NZ + 1)/NZ * sum(abs(x-y)/(x+y))
> 
> with NZ = number of coordinate pairs that are different from (0,0)
> (Non Zeros)
> 
I think you should try another example. In my simple experiments the
multiplier was NZ/NZ, that is, exactly one rather than your "almost one", and
that agrees with the documented formula. I could not find any difference from
the documentation. However, the dist documentation has a note about "double
zeros" (zero numerator and zero denominator). Could that cause the difference?
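
One way to check the double-zero idea is to compare the plain sum with what
dist() returns for a pair of vectors that share a (0,0) coordinate. The sketch
below is only an illustration: the vectors x and y and the helper names nz,
plain, d and ratio are made up for this example, and it assumes that dist()
behaves as its help page describes, dropping 0/0 terms as if they were missing
and scaling the sum up by the number of columns used.

```r
# Two made-up vectors sharing one "double zero" coordinate (position 2).
x <- c(1, 0, 3)
y <- c(2, 0, 5)

# Plain textbook sum, leaving out the 0/0 term by hand.
nz <- !(x == 0 & y == 0)            # TRUE for pairs different from (0, 0)
plain <- sum(abs(x[nz] - y[nz]) / abs(x[nz] + y[nz]))

# What dist() returns for the same two rows.
d <- as.numeric(dist(rbind(x, y), method = "canberra"))

# Ratio of the two, next to (number of coordinates) / NZ for comparison.
ratio <- d / plain
print(c(plain = plain, dist = d, ratio = ratio,
        n_over_NZ = length(x) / sum(nz)))
```

If the ratio matches length(x) / sum(nz), then a single double zero would give
exactly the (NZ + 1)/NZ multiplier from the question, and vectors without any
double zeros would give a multiplier of one.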

If you really want to know how the distance is calculated, download the R
source and look there. If you want to know how the index was originally
defined, you will need the Lance & Williams paper in Aust. Comput. J. 1,
15-20, 1967 (I haven't found it, but would be curious to see it).

Cheers, jari oksanen


