[R] coloring leaves in a hclust or dendrogram plot [solved]

Dylan Beaudette dylan.beaudette at gmail.com
Fri Mar 10 22:36:55 CET 2006


On Thursday 09 March 2006 06:12 pm, Dylan Beaudette wrote:
> Greetings,
>
> I have perused the r-help mailing list archives for an answer to this
> question, to no avail.
>
> I would like to color the "leaves" of a dendrogram plot based on a cutoff
> in one of the variables involved in the initial clustering.
>
> My input data is in the form of:
>                   B         K
> Alameda   0.2475770 0.7524230
> Alpine    0.4546784 0.5453216
> Amador    0.6278610 0.3721390
>
> essentially rows labeled by county name, with two variables: percent voted
> for B and percent voted for K. While it is obvious that this is somewhat of
> a contrived example, I intend to use this as a learning device.
>
> Here is the code used to create and plot the dendrogram:
> hc <- hclust(dist(y), "ave")
> dend <- as.dendrogram(hc)
> plot(dend, main="CA 2004 Election Results by County")
>
> An example of the output can be found here:
> http://casoilresource.lawr.ucdavis.edu/drupal/node/206?size=_original
>
>
> I have experimented with the edgePar and nodePar parameters for the
> plot.dendrogram() method, but have not been able to make sense of the
> output.
>
> The basis for setting the colors of the leaves in the dendrogram is a
> simple majority calculation:
>
> reds <- y[y$B > 0.5, ]
> blues <- y[y$K > 0.5, ]
>
> Such that leaves in the tree will be colored based on the membership in
> either of the two above groups.
>
> Is there a resource documenting how this might be accomplished?
>
> Any thoughts or ideas would be greatly appreciated.
>
> Cheers,
>
> Dylan

Replying to my own post...

Discovered the dendrapply() function:

reds <- as.factor(row.names(y[y$B > 0.5, ]))
blues <- as.factor(row.names(y[y$K > 0.5, ]))

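# Define a function for coloring and sizing node elements.
# Note: `pop` is not defined in this post; it is presumably a numeric vector
# of point sizes in the same row order as y. A leaf made by
# as.dendrogram(hclust(...)) holds the row index of its observation, so
# pop[n] picks the matching entry.
colLab <- function(n)
  {
  if(is.leaf(n))
    {
    a <- attributes(n)
    # leaves whose label is in `blues` are drawn blue, all others red
    if ( a$label %in% blues )
      {
      attr(n, "nodePar") <- c(a$nodePar, list(lab.col = "blue", lab.cex=.7,
                                              col="blue", cex=pop[n], pch=16))
      }
    else
      {
      attr(n, "nodePar") <- c(a$nodePar, list(lab.col = "red", lab.cex=.7,
                                              col="red", cex=pop[n], pch=16))
      }
    }
  n
  }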

# modify dendrogram nodes and re-plot
dend_colored <- dendrapply(dend, colLab)
plot(dend_colored, main="CA 2004 Election Results by County")

...which did the trick

http://casoilresource.lawr.ucdavis.edu/drupal/node/210
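
For the archives, a self-contained sketch of the same approach. It uses only the three rows shown in the original question, swaps pop for a fixed point size (pop is not defined in this post), and adds a hypothetical helper, leaf_colors(), that is not part of R and only serves to read back the color stored on each leaf:

```r
# Three rows copied from the original question; the full county table is
# not shown there, so this is only enough data to illustrate the idea.
y <- data.frame(
  B = c(0.2475770, 0.4546784, 0.6278610),
  K = c(0.7524230, 0.5453216, 0.3721390),
  row.names = c("Alameda", "Alpine", "Amador")
)

hc <- hclust(dist(y), "average")
dend <- as.dendrogram(hc)

# Labels of the rows where K has the majority; all other rows count as B.
blues <- row.names(y)[y$K > 0.5]

# Attach color and point settings to every leaf.
# Assumption: cex is fixed at 1 here because `pop` is not defined in the post.
colLab <- function(n) {
  if (is.leaf(n)) {
    a <- attributes(n)
    leaf_col <- if (a$label %in% blues) "blue" else "red"
    attr(n, "nodePar") <- c(a$nodePar,
                            list(lab.col = leaf_col, lab.cex = 0.7,
                                 col = leaf_col, cex = 1, pch = 16))
  }
  n
}

dend_colored <- dendrapply(dend, colLab)
plot(dend_colored, main = "CA 2004 Election Results by County")

# Hypothetical helper (not part of R): walk the tree and return the color
# stored on each leaf, named by the leaf label.
leaf_colors <- function(n) {
  if (is.leaf(n)) {
    setNames(attr(n, "nodePar")$col, attr(n, "label"))
  } else {
    unlist(lapply(n, leaf_colors))
  }
}

leaf_cols <- leaf_colors(dend_colored)
print(leaf_cols)
```

The order of the leaves depends on the clustering, so look colors up by label (for example leaf_cols["Amador"]) and not by position.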



-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341



