[R] clustering problem

Karin Lagesen karin.lagesen at medisin.uio.no
Wed Feb 20 11:10:30 CET 2008


First, I just want to say thanks for all the help I've had from the
list so far. :)

I now have what I think is a clustering problem. I have many objects
for which I have measured pairwise dissimilarities. The list has only
one entry per pair (A to B, but not B to A), so it is not a full
symmetric matrix.

Example input:

NameA   NameB   Dist
189_1C2 189_1C1 0
189_1C3 189_1C1 0.017
189_1C3 189_1C2 0.017
189_1C4 189_1C1 0
189_1C4 189_1C2 0
189_1C4 189_1C3 0.017
189_1C5 189_1C1 0.05
189_1C5 189_1C2 0.05
189_1C5 189_1C3 0.067
189_1C5 189_1C4 0.05
189_1C6 189_1C1 0.05
189_1C6 189_1C2 0.05
189_1C6 189_1C3 0.067
189_1C6 189_1C4 0.05
189_1C6 189_1C5 0


The distance measure is 0 if two objects are identical, and increases
with increasing dissimilarity up to a maximum of 1.
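
A pair list like the one above, with one row per pair, could be
expanded into a symmetric matrix before clustering. What follows is a
minimal R sketch, assuming whitespace-separated columns as shown; the
variable names (pairs, ids, m, d) are made up for illustration.

```r
# Pair list copied from the example input (one row per pair).
pairs <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
NameA   NameB   Dist
189_1C2 189_1C1 0
189_1C3 189_1C1 0.017
189_1C3 189_1C2 0.017
189_1C4 189_1C1 0
189_1C4 189_1C2 0
189_1C4 189_1C3 0.017
189_1C5 189_1C1 0.05
189_1C5 189_1C2 0.05
189_1C5 189_1C3 0.067
189_1C5 189_1C4 0.05
189_1C6 189_1C1 0.05
189_1C6 189_1C2 0.05
189_1C6 189_1C3 0.067
189_1C6 189_1C4 0.05
189_1C6 189_1C5 0
")

# Every object name that occurs in either column.
ids <- sort(unique(c(pairs$NameA, pairs$NameB)))

# Start from NA so that a pair missing from the list is noticed,
# then set the diagonal (distance of an object to itself) to 0.
m <- matrix(NA_real_, nrow = length(ids), ncol = length(ids),
            dimnames = list(ids, ids))
diag(m) <- 0

# Fill both triangles from each pair, indexing by row and column name.
m[cbind(pairs$NameA, pairs$NameB)] <- pairs$Dist
m[cbind(pairs$NameB, pairs$NameA)] <- pairs$Dist

# Assumption: every pair is present; stop here if any is missing.
stopifnot(!any(is.na(m)))

# Convert to the "dist" class that clustering functions expect.
d <- as.dist(m)
```

Filling both m[A, B] and m[B, A] is what makes the one-entry-per-pair
list symmetric; the NA check guards against pairs that were never
measured.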

What I would like to get from these data is a hierarchical clustering
graph. In this example I would then group

189_1C2 189_1C1 189_1C4,

189_1C6 189_1C5,

and 189_1C3 off with itself.

The distances between the groups should be the mean distances between
the objects within each group (I think).

I have looked at hclust and it seems like it should be able to do what
I want. However, I am unsure of how to use it to get what I am looking
for.
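
One possible way to use hclust for this is sketched below. It assumes
that "mean distance between groups" corresponds to average linkage
(method = "average"), which is only one reading of the requirement
above. The six objects and their distances are taken from the example
input; the variable names and the choice of three groups are
illustrative.

```r
# Object names from the example, in sorted order.
ids <- c("189_1C1", "189_1C2", "189_1C3", "189_1C4", "189_1C5", "189_1C6")

# Lower triangle of the distance matrix, column by column:
# (C2..C6 vs C1), (C3..C6 vs C2), (C4..C6 vs C3), (C5,C6 vs C4), (C6 vs C5).
vals <- c(0, 0.017, 0, 0.05, 0.05,
          0.017, 0, 0.05, 0.05,
          0.017, 0.067, 0.067,
          0.05, 0.05,
          0)

# Build the symmetric matrix and convert it to a "dist" object.
m <- matrix(0, nrow = 6, ncol = 6, dimnames = list(ids, ids))
m[lower.tri(m)] <- vals
m <- m + t(m)
d <- as.dist(m)

# Average linkage: the distance between two clusters is the mean of all
# pairwise distances between their members.
hc <- hclust(d, method = "average")

# Draw the dendrogram.
plot(hc)

# Cut the tree into three groups (assumed number, matching the example).
groups <- cutree(hc, k = 3)
print(split(names(groups), groups))
```

Other values of method ("single", "complete", and so on) give
different between-group distances, so the choice should follow from
what the dissimilarity is meant to express.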

Thank you in advance for your help!

Karin
-- 
Karin Lagesen, PhD student
karin.lagesen at medisin.uio.no
http://folk.uio.no/karinlag


