[R] hclust/dendrogram merging

Peter Langfelder peter.langfelder at gmail.com
Mon Sep 16 21:35:25 CEST 2013


Joshua,

I'm not sure I understand your aim correctly, but if I do, here's my
advice: if you can find the clusters for rows or for columns using
clustering, you must be using some kind of distance matrix that
encodes whether two antibodies belong in the same bin by rows, and a
similar matrix for the columns. To get a clustering that contains
only bins that occur in both directions, you can combine the two
matrices into a single matrix. For example, if each distance is 0
when two antibodies go together and 1 otherwise, you can add the two
matrices (restricted to the antibodies present in both), so that the
sum is 0 only for pairs that go together in both directions. Then
cluster the antibodies on the combined matrix using hclust (with
complete linkage, if I understand it correctly), and use cutree()
with a cut height of, say, 0.5.
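
A minimal sketch of this in R, using the bin assignments from the
example in the quoted message below. The variable names and the
bin_dist() helper are illustrative, and the 0/1 distance is an
assumption about how the bins are encoded:

```r
# Bin assignments copied from the example in the quoted message.
col_bins <- c(Ab1 = 1, Ab2 = 2, Ab3 = 2, Ab4 = 2, Ab5 = 3,
              Ab6 = 4, Ab7 = 5, Ab8 = 5, Ab9 = 6)
row_bins <- c(Ab1 = 1, Ab2 = 2, Ab3 = 3, Ab4 = 3, Ab5 = 4,
              Ab6 = 5, Ab7 = 5, Ab8 = 6, Ab10 = 7)

# Keep only antibodies represented in both directions.
common <- intersect(names(row_bins), names(col_bins))

# Hypothetical helper: 0 if two antibodies share a bin, 1 otherwise.
bin_dist <- function(bins) {
  d <- outer(bins, bins, FUN = "!=") * 1
  dimnames(d) <- list(names(bins), names(bins))
  d
}

# The sum is 0 only for pairs that share a bin in both directions.
combined <- bin_dist(row_bins[common]) + bin_dist(col_bins[common])

# Complete linkage, cut below the smallest nonzero distance.
hc <- hclust(as.dist(combined), method = "complete")
combined_bins <- cutree(hc, h = 0.5)
print(split(names(combined_bins), combined_bins))
```

With these inputs only Ab3 and Ab4 end up sharing a bin; antibodies
that group differently in the two directions come out as singletons,
which can be filtered out afterwards if they are not wanted.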

HTH,

Peter

On Sun, Sep 15, 2013 at 11:33 PM, Joshua Eckman <josheckman at hotmail.com> wrote:
> I am working with protein blocking assays and the end result is a 2D matrix describing which antibodies block the binding of other antibodies to the target antigen. I need to group the antibodies together into "bins" based on their combined profiles in both the row and column direction. I am able to group the blocking profiles of rows vs rows, or columns vs columns, using clustering. The end results could look something like this:
> > col_bins
>        bin
> Ab1    1
> Ab2    2
> Ab3    2
> Ab4    2
> Ab5    3
> Ab6    4
> Ab7    5
> Ab8    5
> Ab9    6
> In this case the "bin" values are just to describe they have similar blocking profiles - so Ab2, Ab3, Ab4 have the same blocking profile, as do Ab7 and Ab8.
> Looking at the row profiles
> > row_bins
>        bin
> Ab1    1
> Ab2    2
> Ab3    3
> Ab4    3
> Ab5    4
> Ab6    5
> Ab7    5
> Ab8    6
> Ab10   7
> The important end result, where I am stuck, is how to combine this with the row direction and only report those that are represented in both directions AND group together in both directions.  It is possible that some Abs will not be represented in both directions.  The "bin" values of row_bins and col_bins are also not important, just the relationship between Abs by name that belong in the same bin, in both directions.
> In other words, a combined bins report would look something like this:
>        bin
> Ab1    A
> Ab3    B
> Ab4    B
> Ab5    C
> I made this visually because it is clear that these are the only groupings that are maintained in both directions.  But real data sets are much bigger, so I need some form of automation.
> Any ideas on how to do this with matrix, dendrogram or clustering functions?
> Thank you,
> josh
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


