[R] question about cluster in R

TEMPL Matthias Matthias.Templ at statistik.gv.at
Wed Apr 19 09:11:42 CEST 2006


Hello,


> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch]
> On Behalf Of Jane Ren
> Sent: Tuesday, 18 April 2006 21:33
> To: R-help at stat.math.ethz.ch
> Subject: [R] question about cluster in R
> 
> Hi all. Sorry for the group mail.
> I recently ran into a question that I have struggled with for a while,
> but I failed to find the solution.
> I have a distance matrix as below.
> 
> ---
> 0    35    33    9    36
> 35    0    10    32    51
> 33    10    0    30    49
> 9    32    30    0    35
> 36    51    49    35    0
> -------------------
> I want to cluster it with the average linkage method.
> ----
> rown<-c("A", "B", "C", "D", "E")
> mydistMatrix <- read.table("D:\\5.distance",row.names = rown)
> 
> mydistObj<-as.dist(mydistMatrix, diag = FALSE, upper = FALSE)
> 
> mycluster <- hclust(mydistObj,method="average")
> 
> bmp(filename = "D:\\5_ave.bmp")
> plot(mycluster,hang=-1)
> 
> dev.off()
> ---
> The result is something like
> 
>     |
>  20 |
>     |   ________________________
>  15 |   |                      |
>     |   |                      |
>  10 |   |          -------------------------
>     |   |          | (intersection)        |
>   5 |   |       -------                    |
>     |   |       |     |                 -------
>   0 |   |       |     |                 |     |
>
>         E       A     D                 B     C
> 
> Then I want to set a threshold to cluster them, say 5.

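## Please provide an executable example (!). Here is one on random data,
## cut at threshold 1.
## Note: hclust() defaults to method = "complete"; pass method = "average"
## for average linkage.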
set.seed(123)
a <- dist(rnorm(100))
ah <- hclust(a)
cutree(ah, h=1)
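
The same cut can be tried on the 5 x 5 matrix from the question. This is only a sketch: the matrix is typed in by hand instead of being read from the file, and the names `m` and `hc` are made up for the illustration.

```r
# Distance matrix from the question, entered directly
# (assumption: this replaces the read.table() call on the original file)
m <- matrix(c( 0, 35, 33,  9, 36,
              35,  0, 10, 32, 51,
              33, 10,  0, 30, 49,
               9, 32, 30,  0, 35,
              36, 51, 49, 35,  0),
            nrow = 5, byrow = TRUE,
            dimnames = list(LETTERS[1:5], LETTERS[1:5]))

# Average linkage, as asked for in the question
hc <- hclust(as.dist(m), method = "average")

# Cut at height 5: no merge happens at or below 5, so each point is its own cluster
cutree(hc, h = 5)

# Cut at height 10: A-D and B-C are each merged by then, E stays alone
cutree(hc, h = 10)
```

cutree() returns one cluster number per observation, so two points belong to the same cluster at the chosen threshold exactly when their numbers are equal.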

> But I don't know whether the A-D distance is larger than 5 or not.

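## for the 70-44 distance: the height at which observations 70 and 44
## first end up in the same cluster (the cophenetic distance).
## ah$merge cannot be compared with c(70, 44) directly, because
## observations are stored there as negative numbers.
coph <- as.matrix(cophenetic(ah))
d <- coph[70, 44]
d > 1
## for 91-95:
d <- coph[91, 95]
d > 1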
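
For the matrix in the question, the heights at the intersections can also be read off directly. The following is a minimal sketch, again with the matrix typed in by hand; `m`, `hc`, `coph` and the output file name are placeholders chosen for the example.

```r
# Distance matrix from the question, entered directly (assumption)
m <- matrix(c( 0, 35, 33,  9, 36,
              35,  0, 10, 32, 51,
              33, 10,  0, 30, 49,
               9, 32, 30,  0, 35,
              36, 51, 49, 35,  0),
            nrow = 5, byrow = TRUE,
            dimnames = list(LETTERS[1:5], LETTERS[1:5]))
hc <- hclust(as.dist(m), method = "average")

# One row per merge: the two items joined and the height of that join.
# Negative numbers are original observations, positive numbers refer to
# the cluster formed in that earlier row.
cbind(hc$merge, height = hc$height)

# Cophenetic distances: for every pair, the height at which the two
# points first end up in the same cluster
coph <- as.matrix(cophenetic(hc))
coph["A", "D"]       # height of the A-D join, 9 for this matrix
coph["A", "D"] > 5   # TRUE, so A and D are not joined at threshold 5

# To save the matrix (placeholder file name):
# write.table(coph, "cophenetic.txt")
```

The heights in hc$height are what the dendrogram draws at each intersection, so they can be checked against a threshold without reading them off the plot.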

I don't know if this is exactly what you want, but perhaps it helps a little.

Best,
Matthias


> I can draw a line to see whether the A-D distance is larger than 5 or not.
> But when the dataset is large, it is difficult to tell.
>
> So I wonder whether there is a way in R to display the distance value at
> the intersection so that we can see its exact value,
> or whether there is a way to show or save the distance matrix after the
> average algorithm.
> Thanks a lot!
> Focus
> 



