[R] hierarchical cluster analysis of groups of vectors

S Ellison S.Ellison at lgc.co.uk
Tue May 29 15:47:40 CEST 2007


If you want to _test_ for differences, ANOVA applied to the (typically) first principal component scores for each object would give a fairly quick indication of whether there was a case to answer. Scaling is an issue to be aware of, though: a low-variance variable might differ strongly between groups yet be masked by a larger-variance variable with no group association, unless you get the scaling right for the circumstances.
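A minimal sketch of that test, assuming the 50 vectors sit as rows of a numeric matrix with a matching grouping factor. The data here are simulated and the names x, grp and pc1 are made up for illustration; scaling to unit variance via scale. = TRUE is only one possible choice and may not be the right one for your data.

```r
# Simulated stand-in for the real data: 10 groups x 5 vectors, 4 variables.
set.seed(1)
n_groups <- 10
n_per    <- 5
n_vars   <- 4
grp <- factor(rep(seq_len(n_groups), each = n_per))
x   <- matrix(rnorm(n_groups * n_per * n_vars), ncol = n_vars)

# Add a group-dependent shift to every variable so there is something to find.
x <- x + as.integer(grp) * 0.5

# PCA on variables scaled to unit variance (an assumption, see above).
pca <- prcomp(x, scale. = TRUE)
pc1 <- pca$x[, 1]          # first principal component score per vector

# One-way ANOVA of the PC1 scores against group membership.
fit <- aov(pc1 ~ grp)
print(summary(fit))
```

If the first component does not carry most of the variance, it may be worth repeating the ANOVA on the second or third component as well.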

If you just want to cluster the 10 groups, I suspect it might be simplest to "average" your starting vectors (where "average" implies some consistent summary statistic for each variable) _before_ playing about with your distance matrix; after all, it is the inter-"mean" distances you are after, so why not get the "means" in the first place? Of course, scaling is again an issue if the variables differ in variance...
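Something along these lines might do as a starting point. Again the data are simulated and the names (x, grp, centres) are invented for the example; the mean as summary statistic, unit-variance scaling and average linkage are all assumptions you may want to change.

```r
# Simulated stand-in for the real data: 10 groups x 5 vectors, 4 variables.
set.seed(1)
grp <- factor(rep(paste0("G", 1:10), each = 5))
x   <- matrix(rnorm(50 * 4), ncol = 4,
              dimnames = list(NULL, paste0("v", 1:4)))

# Scale each variable to zero mean and unit variance before summarising.
x_scaled <- scale(x)

# One "mean" vector per group (swap FUN for median etc. if preferred).
centres <- aggregate(as.data.frame(x_scaled),
                     by = list(group = grp), FUN = mean)
m <- as.matrix(centres[, -1])
rownames(m) <- as.character(centres$group)

# Distances between the 10 group means, then hierarchical clustering.
d  <- dist(m)                       # Euclidean by default
hc <- hclust(d, method = "average")

# Dendrogram with one leaf per group.
if (interactive()) plot(hc, main = "Clustering of group means")
```

The dendrogram then has 10 leaves, one per group, rather than 50.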

Steve E

>>> Anders Malmendal <anders at chem.au.dk> 29/05/2007 10:15:23 >>>
I want to do hierarchical cluster analysis to compare 10 groups of 
vectors with five vectors in each group (i.e. I want to make a dendrogram 
showing the clustering of the different groups). I've looked into using 
dist and hclust, but cannot see how to compare the different groups 
instead of the individual vectors. I am thankful for any help.

