[BioC] linkage distances

David Ruau David.Ruau at rwth-aachen.de
Wed Jun 13 21:36:17 CEST 2007

For a good source of information on linkage methods, have a look at
this book:
"Finding Groups in Data: An Introduction to Cluster Analysis"
by L. Kaufman and P. J. Rousseeuw
(Wiley)

It is a really easy book to read.
To understand linkage methods, look at chapter 5, page 199. A further
explanation is given on page 225.
There is also a quick overview on page 47. All the methods
described in this book are implemented in the package 'cluster'.
In the end, for the linkage method, I always use the same one: UPGMA, also
called the average method.
In the book they mention that you have to choose the linkage method
according to the type of cluster shape you are looking for.
I have never found a way to know the cluster shape when your matrix has
more than 3 dimensions... :)
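
As a rough illustration of comparing linkage methods on the same distance matrix, here is a minimal sketch in R. The data are simulated and the choice of k = 3 is arbitrary; both are assumptions for the example, not something from the book. The cophenetic correlation is just one common way to see how faithfully a tree reproduces the original distances. It does not tell you which cluster shape is "right".

```r
# Toy data: 20 samples x 5 variables (simulated, not real expression data)
set.seed(1)
x <- matrix(rnorm(100), nrow = 20)
rownames(x) <- paste0("s", 1:20)

d <- dist(x)  # Euclidean distance between rows

# Same distances, three linkage methods from stats::hclust
methods <- c("single", "complete", "average")  # "average" is UPGMA
trees <- lapply(methods, function(m) hclust(d, method = m))
names(trees) <- methods

# Cophenetic correlation: how well each tree preserves the input distances
coph <- sapply(trees, function(tr) cor(d, cophenetic(tr)))
print(round(coph, 3))

# Cut the UPGMA tree into 3 groups (k = 3 is an arbitrary choice here)
groups <- cutree(trees[["average"]], k = 3)
print(table(groups))

# If the 'cluster' package is installed, agnes() gives the same kind of
# tree plus an agglomerative coefficient
if (requireNamespace("cluster", quietly = TRUE)) {
  ag <- cluster::agnes(d, method = "average")
  print(ag$ac)
}
```

The vector `groups` can then be used as a grouping factor for the downstream analysis.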

What I play with is the distance/similarity measure.
When speaking about distance you should distinguish between
metric distances (Euclidean...), parametric correlations and
non-parametric correlations.
Parametric correlation measures (such as Pearson) can, due to their
sensitivity to outliers, give non-homogeneous cluster solutions. In this
case non-parametric correlations, such as Spearman rank correlation or
Kendall's tau rank correlation, are preferred.
The distance used by Eisen in his 1998 paper is the cosine correlation
distance, also called uncentered Pearson, and it gives good results.
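
The distance measures mentioned above could be computed in R roughly as follows. This is only a sketch on simulated data: `cor_dist` and `uncentered_dist` are hypothetical helper names made up for the example, and "1 - correlation" is one common way (not the only one) to turn a correlation into a dissimilarity.

```r
set.seed(2)
# Toy matrix: 6 genes (rows) x 10 samples (columns), simulated
x <- matrix(rnorm(60), nrow = 6,
            dimnames = list(paste0("g", 1:6), NULL))

# Correlation-based distance between rows: 1 - correlation
# (hypothetical helper, not part of any package)
cor_dist <- function(x, method = c("pearson", "spearman", "kendall")) {
  method <- match.arg(method)
  as.dist(1 - cor(t(x), method = method))
}

# Uncentered Pearson (cosine) distance: same formula as Pearson,
# but without subtracting each row's mean first
# (hypothetical helper, not part of any package)
uncentered_dist <- function(x) {
  norms <- sqrt(rowSums(x^2))
  sim <- tcrossprod(x) / outer(norms, norms)  # cosine similarity
  as.dist(1 - sim)
}

d_spearman <- cor_dist(x, "spearman")
d_kendall  <- cor_dist(x, "kendall")
d_cosine   <- uncentered_dist(x)

# Any of these can be passed to hclust with the linkage of your choice
hc <- hclust(d_spearman, method = "average")
```

If the rows are mean-centered beforehand, the uncentered measure gives the same values as the ordinary Pearson distance, which is a quick way to check the helper.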

David Ruau
Institute for Biomedical Engineering
-Cell Biology-
Universitatsklinikum Aachen, RWTH
Pauwelsstrasse 30
52074 Aachen
GPG: 4210CA11

On Jun 13, 2007, at 3:13 PM, Daniel Brewer wrote:

> Hi,
> I have been producing some dendrograms using hclust with a variety of
> linkage distance measures.  Does anyone know or is there a good
> resource that explains why one would use one linkage distance rather
> than another?
> I don't really like dealing with dendrograms, but we want to produce
> groupings based on these to do differential analysis on, and I would
> like to be able to justify it.
> Thanks
> Dan
> -- 
> **************************************************************
> Daniel Brewer, Ph.D.
> Institute of Cancer Research
> Email: daniel.brewer at icr.ac.uk
> **************************************************************