[R] Colouring hclust() trees

Richard A. O'Keefe ok at cs.otago.ac.nz
Tue May 11 05:24:15 CEST 2004


I asked about putting some kind of coloured rug under a dendrogram.

Thomas Petzoldt <petzoldt at rcs.urz.tu-dresden.de> replied:
	One possibility is to extract the coordinates used by the dendrogram 
	using par("usr") ...

Er, the documentation for par("usr") says
    'usr' A vector of the form 'c(x1, x2, y1, y2)' giving the extremes
          of the user coordinates of the plotting region.  When a
          logarithmic scale is in use (i.e., 'par("xlog")' is true, see
          below), then the x-limits will be '10 ^ par("usr")[1:2]'. 
          Similarly for the y-axis.
But I _know_ the (logical) coordinates of the plotting region; what I need
is the coordinates of the leaves of the dendrogram.
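
For what it is worth, the leaf positions can be had without par("usr"):
as far as I can tell, plot() on an hclust object puts the leaves at
x = 1, 2, ..., n, left to right, in the order given by the 'order'
component.  The following is only a sketch under that assumption.  The
iris data, the three colours, and the names leaf_x, leaf_col and tick are
stand-ins chosen for illustration, not anything from the original data.

```r
# Stand-in data: iris measurements, with Species playing the role of
# the prespecified classes (an assumption for illustration only).
x      <- as.matrix(iris[, 1:4])
groups <- iris$Species

hc <- hclust(dist(x), method = "average")

# hang = -1 makes every leaf hang down to height 0.
plot(hc, labels = FALSE, hang = -1, xlab = "", sub = "")

# Leaf i (counting from the left) sits at x = i and is observation
# hc$order[i], so the classes must be reordered the same way.
leaf_x   <- seq_along(hc$order)
leaf_col <- c("red", "green3", "blue")[as.integer(groups[hc$order])]

# Draw the rug as short coloured ticks just below height 0.
# xpd = NA lets the ticks extend outside the plot region.
usr  <- par("usr")
tick <- 0.03 * (usr[4] - usr[3])
segments(leaf_x, 0, leaf_x, -tick, col = leaf_col, xpd = NA)
```

The only step that matters is indexing the classes by hc$order; the rest
is decoration and could equally be done with rect() or points().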

	but as a global alternative in cases like this (many cases and
	known number of classes), I would suggest a different cluster
	algorithm, e.g. ?kmeans.

That doesn't really help, amongst other things because kmeans is not
a hierarchical algorithm.  I *DON'T* know the true number of classes.
I know how many classes the person who collected the data thinks there
are, and I don't need to do any clustering to find them, he gave me a
simple rule.  What I want to know is how many clusters there OUGHT to be
and how similar these clusters are to the ones he thought there were.
From poking around, the "right" number of clusters is somewhere between
2 and 6.  (For the record, I _have_ tried kmeans and I've tabulated the
kmeans groups against the prespecified groups.)
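
A minimal sketch of that kind of cross-tabulation, again with iris
standing in for the real data and Species for the prespecified groups
(both are assumptions for illustration).  The second half does the same
thing for a hierarchical clustering cut at 2 to 6 clusters.

```r
# Stand-in data; scaling is a choice made here, not something the
# original analysis is known to have done.
x      <- scale(iris[, 1:4])
groups <- iris$Species

set.seed(1)                        # kmeans starts from random centres
km <- kmeans(x, centers = 3, nstart = 20)
km_tab <- table(kmeans = km$cluster, prespecified = groups)
print(km_tab)

# Same comparison for a hierarchical clustering, cut at k = 2, ..., 6.
hc <- hclust(dist(x), method = "average")
for (k in 2:6) {
  cat("\nk =", k, "\n")
  print(table(cluster = cutree(hc, k = k), prespecified = groups))
}
```

A table with one dominant cell per row suggests the found clusters line
up with the prespecified groups; rows spread across several columns
suggest they do not.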

	If you want to get a visual idea you may try to apply an
	ordination method (e.g. princomp or isoMDS the latter from
	package MASS) and color the objects according to their class
	found by kmeans.
	
I had already done that (using the prespecified classes, not classes found
by kmeans).  But it didn't solve my present problem, which was overlaying
the *prespecified* classes onto a dendrogram.
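
For reference, an ordination coloured by prespecified classes might look
roughly like the sketch below.  The data (iris), the class variable
(Species) and the colour choices are assumptions for illustration.

```r
# Stand-in data and classes (illustrative only).
x      <- as.matrix(iris[, 1:4])
groups <- iris$Species

# Principal components on the correlation matrix.
pc <- princomp(x, cor = TRUE)

# Plot the first two component scores, one colour per prespecified class.
class_col <- c("red", "green3", "blue")[as.integer(groups)]
plot(pc$scores[, 1:2], col = class_col, pch = 19)
legend("topright", legend = levels(groups),
       col = c("red", "green3", "blue"), pch = 19)
```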

Two other people gave me answers that are spot on.
Unfortunately, I've now lost their messages, so I can't name them.

Suggestion 1:  use the RowSideColors (or ColSideColors) argument of heatmap().
This gives me two dendrograms (and I can suppress one if I want) and a heat
image of the data, and all things considered, it's *better* than what I wanted.
(I was aware of heatmap, but I'd failed to notice the relevance, or even the
existence, of the ???SideColors arguments.)  In this particular case, the
graph _beautifully_ displays what I want it to display.
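
A sketch of Suggestion 1, with iris standing in for the real data and
Species for the prespecified classes (both assumptions for illustration;
the name side_col is likewise made up).  RowSideColors takes one colour
per row of the matrix, and Colv = NA suppresses the column dendrogram.

```r
# Stand-in data and classes (illustrative only).
x      <- as.matrix(iris[, 1:4])
groups <- iris$Species

# One colour per row, in the original row order; heatmap() reorders
# them itself to match the row dendrogram.
side_col <- c("red", "green3", "blue")[as.integer(groups)]

hm <- heatmap(x,
              RowSideColors = side_col,  # coloured strip beside the rows
              Colv  = NA,                # no column dendrogram or reordering
              scale = "column")          # standardise each variable
```

Note that the colours are supplied in data order, not dendrogram order,
which is what makes this easier than drawing a rug by hand.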

Suggestion 2:  use the draw.clust function from the maptree package.
I have now installed this package (which R makes *so* easy) and it does
exactly what I asked for.

Both of these approaches work with any dendrogram.

I'm beginning to suspect that if something isn't already available in R,
I'll never be able to imagine a need for it.  But then I'm a bear of
very little brain...



