[R] R Hierarchical clustering leaf node

Friedrich.Leisch at tuwien.ac.at
Fri Dec 2 09:16:14 CET 2005


>>>>> On Thu, 01 Dec 2005 11:34:01 -0600,
>>>>> Qunfeng  (Q) wrote:

  > Hello,
  > I am new to the R package. After I use R to perform the hierarchical 
  > clustering,  I am only interested in retrieving the leaf nodes that share 
  > the last common ancestors. As illustrated below, I'd like to retrieve (B, 
  > C) as a cluster and then (D, E) as another cluster.    Any chance to do 
  > this in R?  Thanks! BTW, I just subscribed to this list (not sure if the 
  > subscription succeeded), please copy your answer to my personal email 
  > (qfdong at iastate.edu) -- Qunfeng

Once you know the internal structure of an hclust object, this is
quite easy for groups of two (getting triplets or larger groups
would require a little more code).

As an example we can use:

R> set.seed(1)
R> x=rnorm(5)
R> h=hclust(dist(x))
R> str(as.dendrogram(h))
--[dendrogram w/ 2 branches and 5 members at h = 2.43]
  |--leaf 4
  `--[dendrogram w/ 2 branches and 4 members at h = 1.17]
     |--[dendrogram w/ 2 branches and 2 members at h = 0.146]
     |  |--leaf 2
     |  `--leaf 5
     `--[dendrogram w/ 2 branches and 2 members at h = 0.209]
        |--leaf 1
        `--leaf 3

The key is the "merge" element of the returned object; from it you can
extract the two pairs with

R> -h$merge[apply(h$merge,1,function(x) all(x<0)),]

     [,1] [,2]
[1,]    2    5
[2,]    1    3
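
For reference, a slightly more defensive variant of the same idea, plus a
rough sketch of the "little bit more code" needed for larger groups. In
h$merge each row is one merge step: a negative entry -i stands for the
single observation i, and a positive entry j stands for the cluster formed
in row j. The names is_leaf_pair, leaf_pairs and cluster_members are my own
and purely illustrative; they are not part of R or of hclust, and this is
only one possible way to do it.

```r
# Same example as above.
set.seed(1)
x <- rnorm(5)
h <- hclust(dist(x))

# A row joins two leaves when both of its entries are negative.
is_leaf_pair <- rowSums(h$merge < 0) == 2

# drop = FALSE keeps the result a matrix even if only one row qualifies.
leaf_pairs <- -h$merge[is_leaf_pair, , drop = FALSE]

# Hypothetical helper (not part of R): collect the leaf members of the
# cluster formed at every merge step, so groups of any size can be picked.
cluster_members <- function(h) {
  members <- vector("list", nrow(h$merge))
  for (i in seq_len(nrow(h$merge))) {
    parts <- lapply(h$merge[i, ], function(j) {
      # negative: a single observation; positive: an earlier merge step
      if (j < 0) -j else members[[j]]
    })
    members[[i]] <- sort(unlist(parts))
  }
  members
}

members <- cluster_members(h)
sizes <- lengths(members)
```

With this, members[sizes == 3] would return the clusters consisting of
exactly three leaves (there are none in this small example), and
members[sizes == 2] gives the same pairs as leaf_pairs, as a list.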

HTH,

-- 
-------------------------------------------------------------------
                        Friedrich Leisch 
Institut für Statistik                     Tel: (+43 1) 58801 10715
Technische Universität Wien                Fax: (+43 1) 58801 10798
Wiedner Hauptstraße 8-10/1071
A-1040 Wien, Austria             http://www.ci.tuwien.ac.at/~leisch



