[R] gower distance calculation

Gavin Simpson gavin.simpson at ucl.ac.uk
Mon Nov 20 10:17:28 CET 2006


On Sun, 2006-11-19 at 23:47 -0400, Tyler Smith wrote:
> Gavin Simpson wrote:
> 
> > vegdist in package vegan has Gower's distance, but all variables have to
> > be numeric.
> 
> > If you want to use mixed data (numerics, factors, binary), see ?daisy in
> > package cluster.
> 
> This is a little unclear. vegdist will handle regular quantitative
> variables as well as binary variables. This is not so much a feature
> of vegdist as of the Gower similarity, which treats binary and
> quantitative variables identically, since a simple matching
> coefficient produces the same similarity value as is produced by
> Gower's quantitative similarity function for a variable that can take
> only two values.
> 
> Perhaps that's what you meant, and I just misunderstood you. Perhaps
> I'm wrong, and someone will correct me!
> 
> Cheers,
> 
> Tyler

The way I have seen this presented places special emphasis on binary
matches and treats them as different from quantitative matches.

Having thought this through a bit more, I see that numerically they are
the same, but only if the both-absent (double zero) case is handled
symmetrically (as in Gower's 1971 paper). If a double zero is handled
asymmetrically (if Xij = Xik = 0, then Sjk = 0, not 1), then they are
not equivalent.
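
To make the point above concrete, here is a minimal sketch in plain
Python (it is not the vegdist or daisy code, and the function name
gower_similarity and its arguments are made up for illustration). It
assumes every variable has a known, non-zero range, and it assumes that
the asymmetric rule drops a double zero from the comparison entirely
(zero score and zero weight), which is one common reading of that rule.

```python
def gower_similarity(a, b, ranges, binary=(), asymmetric=False):
    """Gower-style similarity between two samples a and b.

    ranges[i] is the range of variable i over the whole data set.
    binary holds the indices of the presence/absence variables.
    If asymmetric is True, a double zero on a binary variable is
    skipped (assumption: it gets zero score and zero weight).
    """
    score = 0.0
    weight = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if asymmetric and i in binary and x == 0 and y == 0:
            continue  # double zero: treated as uninformative
        # Quantitative per-variable similarity; for a 0/1 variable
        # with range 1 this is 1 on a match and 0 on a mismatch.
        score += 1.0 - abs(x - y) / ranges[i]
        weight += 1.0
    return score / weight if weight else float("nan")


# Two samples described by four presence/absence variables.
a = [1, 0, 0, 1]
b = [1, 0, 1, 0]
ranges = [1, 1, 1, 1]
binary = {0, 1, 2, 3}

# Simple matching coefficient: proportion of variables that agree.
matching = sum(x == y for x, y in zip(a, b)) / len(a)

sym = gower_similarity(a, b, ranges)
asym = gower_similarity(a, b, ranges, binary, asymmetric=True)

print(matching, sym, asym)
```

With symmetric handling the quantitative formula and simple matching
agree (both give 0.5 here); once the double zero on the second variable
is dropped, the similarity falls to 1/3 and the two no longer match.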

I'm an ecologist, and double zeroes are often handled differently in
ecology. If a species is absent from two samples, it could be because
you didn't look hard enough in one or both samples, or because the
environments in both samples are not optimal for that species, but in
different ways (one too acid, one too alkaline; one too wet, one too
dry; etc.). In both cases, both being 0 should not (necessarily) be
taken as a sign of similarity. So I always remember binary as being a
special case because of this.

So it was me being a bit ecology-centric in my reply. Thanks for
clarifying what I had said.

All the best,

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [t] +44 (0)20 7679 0522
ECRC                              [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building                  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                        [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT                          [w] http://www.freshwaters.org.uk/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


