[BioC] dendrograms on heatmap.2 (gplots)

Gavin Koh gavin.koh at gmail.com
Sat May 28 19:53:49 CEST 2011


Dear Steve,

Yes, I expect that if I preserve the order in which I currently have the
samples, the branches will cross.
You are right: it will be clearer to cluster by k-means and then use
ColSideColors to colour the leaves than to try to draw a dendrogram
with criss-crossing branches. Thanks for helping me think this
through.
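
A minimal sketch of that approach, in case it helps anyone reading the
archive. The matrix my.data here is simulated (50 genes, 10 controls
followed by 10 patients), and the colours and the choice of two centres
are assumptions for illustration, not the real study data. Colv = FALSE
keeps the columns in their original order, so no column dendrogram is
drawn and nothing has to cross.

```r
set.seed(1)

# Hypothetical data: 50 genes x 20 samples; columns 1-10 are controls,
# columns 11-20 are patients (an assumption for this sketch).
my.data <- matrix(rnorm(50 * 20), nrow = 50,
                  dimnames = list(paste0("gene", 1:50), paste0("s", 1:20)))
my.data[1:10, 11:20] <- my.data[1:10, 11:20] + 3  # shift some genes in patients

# kmeans() clusters rows, so transpose to cluster the samples (columns).
km <- kmeans(t(my.data), centers = 2, nstart = 25)

# One colour per column, in the original column order.
col.colours <- c("blue", "red")[km$cluster]

# Draw only if gplots is installed; the column order is left untouched.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(my.data, Colv = FALSE, dendrogram = "row",
                    ColSideColors = col.colours, trace = "none")
}
```

Any misclassified subjects then show up as a stray colour inside the
other group's block of columns.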

Gavin.

On 28 May 2011 18:20, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
> Hi Gavin,
>
> On Sat, May 28, 2011 at 11:06 AM, Gavin Koh <gavin.koh at gmail.com> wrote:
>> Dear Steve, I have healthy controls and patients, so two groups.
>> k-means misclassifies a few study subjects, but by and large,
>> redrawing the dendrogram while preserving the ordering is not going to
>> seriously mess things up.
>
> Sorry if my post came across in the wrong way -- I'm not trying to
> imply that you are trying to show something that isn't true, or
> something ... I'm actually not sure how you interpreted my email,
> because I'm not sure what you're trying to say in your reply, so let
> me try another way :-)
>
> I guess my point is that: yes, you have two groups when you condition
> group assignment based on a state we call "healthy" and "affected" (or
> whatever you call them here).
>
> If you are asking to group your patients in a different way -- this
> time using your gene expression profiles -- it's not totally unusual
> for things to change a bit.
>
> So, again, I'm not trying to lecture here, but this is the way I
> understand it. If I'm wrong, feel free to correct me:
>
> The distances we "walk along" the arms/branches of the dendrogram say
> something about the distance between the "things" they are connecting.
> If you didn't change any params in your heatmap call, the default
> distance measure between your vectors is the Euclidean distance,
> and that just is what it is. The dendrogram is then drawn to
> respect those distances. If you move things around, then you are
> saying something different about those distances, right?
>
> In this context, I'm confused about your point when you say "redrawing
> the dendrogram while preserving the ordering is not going to seriously
> mess things up" -- what ordering do you expect to be preserved ... is
> it the columns of the matrix that you passed in? If you don't want to
> move those columns around, then do you want the branches of the tree
> to criss-cross or something?
>
> The way I see it, you are kind of stuck if you intend to draw a
> dendrogram at all.
>
> So -- how can we move things around in a natural way?
>
> Maybe you can choose a different distance measure?
> Maybe you can normalize your data in a different way?
> Maybe you can plot a subset of genes -- maybe those with the highest
> variance across all your data, which might result in new distances
> calculated, and a different drawing of the branches on the tree.
>
> You could always pass in your own dendrogram structure to the heatmap
> and "arbitrarily" calculate distances so that the tree draws as you
> want, but I don't think that's something you'd want to do anyway.
>
> Another approach to show "likeness" between expression profiles is to
> not focus on the dendrogram lining up "just so", but to rather add a
> list of colors to the examples (columns) of your data by using the
> "ColSideColors" parameter. Say the first 10 columns of your matrix are
> from the 10 controls, and the last 10 are from the affecteds. You can
> do:
>
> R> heatmap.2(my.data, ..., ColSideColors=c(rep('blue', 10), rep('red', 10)))
>
> If, as you say, the expression profiles are *mostly* similar, you'll
> see that, by and large, the blue experiments will be "chunked" w/
> blue, and the red expts are chunked with the red, which might show the
> same point you're trying to make with the dendrogram.
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
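
For reference, a rough sketch of the options Steve lists above, again on
simulated data (my.data, the cutoff of 25 genes, and the use of
1 - correlation as the alternative distance are all assumptions made for
illustration). dist() and hclust() are the default distfun and hclustfun
in heatmap.2, so the first tree is built on the same Euclidean distances
the default column dendrogram uses; the other two show how a different
distance or a high-variance gene subset gives a different tree, and
hence possibly a different leaf order.

```r
set.seed(2)

# Hypothetical data: 100 genes x 20 samples.
my.data <- matrix(rnorm(100 * 20), nrow = 100,
                  dimnames = list(paste0("gene", 1:100), paste0("s", 1:20)))

# Default-style column tree: Euclidean distance between samples,
# then complete-linkage hierarchical clustering.
dend <- as.dendrogram(hclust(dist(t(my.data))))
leaf.order <- order.dendrogram(dend)  # column order implied by the tree

# Option 1: a different distance measure (1 - Pearson correlation).
d.cor <- as.dist(1 - cor(my.data))
dend.cor <- as.dendrogram(hclust(d.cor))

# Option 2: keep only the 25 genes with the highest variance.
gene.var <- apply(my.data, 1, var)
top <- order(gene.var, decreasing = TRUE)[1:25]
dend.top <- as.dendrogram(hclust(dist(t(my.data[top, ]))))

# A precomputed dendrogram can be passed to heatmap.2 through Colv.
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(my.data[top, ], Colv = dend.top, trace = "none")
}
```

Comparing order.dendrogram() across the three trees shows how far each
choice moves the samples relative to the original column order.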



-- 
Hofstadter's Law: It always takes longer than you expect, even when
you take into account Hofstadter's Law.
—Douglas Hofstadter (in Gödel, Escher, Bach, 1979)


