[BioC] Extracting dendogram information from Heatmaps

'Thomas Girke' thomas.girke at ucr.edu
Fri Dec 14 06:51:42 CET 2007


You may have mixed up the orientation of your matrix. Does 
length(myclust$labels) correspond to the number of rows in
myma?
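
A minimal sketch of that check, in case it helps. The matrix below is a made-up stand-in for myma (its size and names are assumptions for illustration only), and colclust is a hypothetical name:

```r
# Hypothetical stand-in for myma: 10 genes (rows) x 6 arrays (columns)
set.seed(1)
myma <- matrix(rnorm(60), nrow = 10, ncol = 6,
               dimnames = list(paste0("gene", 1:10), paste0("array", 1:6)))

# Row clustering: dist() works on rows, so the labels are the row names
myclust <- hclust(dist(myma))
rows_match <- length(myclust$labels) == nrow(myma)

# Column clustering needs the transposed matrix
colclust <- hclust(dist(t(myma)))
cols_match <- length(colclust$labels) == ncol(myma)

print(c(rows = rows_match, cols = cols_match))
```

If the first comparison is FALSE for your data, then either the matrix had no row names when it was clustered or the clustering was computed on the other dimension.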

Thomas 

On Fri 12/14/07 00:05, alison waller wrote:
> Great, I passed genes names to the matrix through rownames(myma).
> 
> Your example below is great!
> 
> However, I would like to use heatmap.2 and am having trouble specifying the
> row clustering.  I created Rowv and Colv with
> Rowv <- as.dendrogram(myclust) and Colv <- as.dendrogram(hclust(dist(t(myma)))),
> and thought I could just create the heatmap as below.
> But it won't accept the Rowv or Colv I pass, so the computer has to recalculate
> the clusters every time I want to change colors or something little (this
> takes a while).  Is there something wrong with my syntax?
> 
> 
> >heatmap.2(myma,Rowv=Rowv,Colv=Colv,col=topo.colors(75),RowSideColors=mycolhc42,trace='none',labRow=FALSE,key=T)
> Warning messages:
> 1: gamma cannot be modified on this device 
> 2: Discrepancy: Rowv is FALSE, while dendrogram is `both'. Omitting row
> dendogram. in: heatmap.2(myma, Rowv = Rowv, Colv = Colv, col =
> topo.colors(75),  
> 3: Discrepancy: Colv is FALSE, while dendrogram is `none'. Omitting column
> dendogram. in: heatmap.2(myma, Rowv = Rowv, Colv = Colv, col =
> topo.colors(75),  
> >heatmap.2(myma,Rowv=Rowv,col=topo.colors(75),RowSideColors=mycolhc42,trace='none',labRow=FALSE,key=T)
> Warning message:
> Discrepancy: Rowv is FALSE, while dendrogram is `column'. Omitting row
> dendogram. in: heatmap.2(myma, Rowv = Rowv, col = topo.colors(75),
> RowSideColors = mycolhc42,  
> > ?heatmap.2
> -----Original Message-----
> From: 'Thomas Girke' [mailto:thomas.girke at ucr.edu] 
> Sent: Thursday, December 13, 2007 1:21 PM
> To: alison waller
> Cc: 'James W. MacDonald'; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Extracting dendogram information from Heatmaps
> 
> The best way to answer these questions is to subset your data set
> to a test matrix with only a few rows. This way you can see the labels
> in the plot and things become intuitive.
> For example: 
> 	myma <- MA$M[fitp,]
> 	myma <- myma[1:20,]
> 
> Row names can always be assigned by you with 
> 	rownames(myma) <- mynames
> 
> If your data set has a label column, then it would be
> 	rownames(myma) <- mynames$label
> 
> To be sure your data set is a matrix, you do:
> 	myma <- as.matrix(myma)
> 
> Continue with hclust ...
> 
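A small sketch of why the row names matter for the later hclust step. The matrix and the names in mynames are made up for illustration, and no_labels is a hypothetical helper variable:

```r
# Hypothetical data: an unnamed 20 x 6 matrix plus made-up gene names
set.seed(2)
myma <- matrix(rnorm(120), nrow = 20, ncol = 6)
mynames <- paste0("gene", 1:20)

# Without row names, hclust() has no labels to store
no_labels <- is.null(hclust(dist(myma))$labels)
print(no_labels)

# After assigning row names, dist() and hclust() carry them through
rownames(myma) <- mynames
myclust <- hclust(dist(myma))
print(head(myclust$labels))
```

So assigning rownames(myma) before calling dist() should be enough to make myclust$labels usable.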
> Thomas
> 
> On Thu 12/13/07 12:50, alison waller wrote:
> > Thanks everyone, these are great suggestions.
> > 
> > I had trouble with the identify, as the plot moved when I clicked the mouse
> > and I got error messages.
> > 
> > The cutree worked well - however, I see a matrix which has values
> > corresponding to clusters, but is cluster one the leftmost or rightmost
> > cluster? I.e., how are they ordered?
> > 
> > The $labels method seems the best but my matrix doesn't seem to have labels.
> > I made my matrix from the M values from an MAList, is there a way to carry
> > through the gene names?
> > 
> > Myclust<-hclust(dist(MA$M[fitp,]))
> > Myclust$labels gives NULL
> > 
> > Thanks again,
> > 
> > alison
> > 
> > 
> > -----Original Message-----
> > From: Thomas Girke [mailto:thomas.girke at ucr.edu] 
> > Sent: Thursday, December 13, 2007 12:00 PM
> > To: James W. MacDonald
> > Cc: alison waller; bioconductor at stat.math.ethz.ch
> > Subject: Re: [BioC] Extracting dendogram information from Heatmaps
> > 
> > Alison,
> > 
> > In addition to James' suggestions, you may want to become familiar with how
> > to access the different data components of the resulting hclust object
> > (e.g. labels, order) and the cutree() function. If you can't read the labels
> > in the plots, then you can always extract them as plain text in the
> > corresponding tree order (see below: hr$labels[hr$order]) from the hclust
> > objects.
> > 
> > Here is a short example to illustrate a possible hclust-heatmap/heatmap.2
> > routine:
> > 
> > # Generate a sample matrix
> > y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""),
> > paste("t", 1:5, sep=""))) 
> > 
> > # Cluster rows and columns by correlation distance
> > hr <- hclust(as.dist(1-cor(t(y), method="pearson"))) 
> > hc <- hclust(as.dist(1-cor(y, method="spearman"))) 
> > 
> > # Obtain discrete clusters with cutree
> > mycl <- cutree(hr, h=max(hr$height)/1.5)
> > 
> > # Print the row labels in the order they appear in the tree
> > hr$labels[hr$order]
> > # Prints the row labels and cluster assignments
> > sort(mycl) 
> > 
> > # Some color selection steps
> > mycolhc <- sample(rainbow(256))
> > mycolhc <- mycolhc[as.vector(mycl)]
> > 
> > # Plot the data matrix as heatmap and the cluster results as dendrograms
> > # with heatmap or heatmap.2, and show the cutree() results in a color bar.
> > heatmap(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc), scale="row",
> > RowSideColors=mycolhc) 
> > 
> > library("gplots") 
> > heatmap.2(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc),
> > col=redgreen(75), scale="row", 
> > ColSideColors=heat.colors(length(hc$labels)), RowSideColors=mycolhc,
> > trace="none", key=T, cellnote=round(t(scale(t(y))),1))
> > 
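As a possible follow-up to the example above, here is a sketch of how the cluster assignments could be listed in tree order and saved to a file. The seed, the data frame name clusters and the temporary output file are assumptions for illustration; y, hr and mycl are built as in the example:

```r
# Rebuild the sample objects from the example (seed added for reproducibility)
set.seed(42)
y <- matrix(rnorm(50), 10, 5,
            dimnames = list(paste0("g", 1:10), paste0("t", 1:5)))
hr <- hclust(as.dist(1 - cor(t(y), method = "pearson")))
mycl <- cutree(hr, h = max(hr$height) / 1.5)

# One row per gene, listed in the order the leaves appear in the dendrogram
clusters <- data.frame(label = hr$labels[hr$order],
                       cluster = mycl[hr$order],
                       stringsAsFactors = FALSE)
print(clusters)

# Save as a tab-delimited text file (a temporary file here; use your own path)
outfile <- tempfile(fileext = ".txt")
write.table(clusters, file = outfile, sep = "\t", quote = FALSE,
            row.names = FALSE)
```

Note that cutree() numbers the clusters by the order of the observations in the data, so the first row of the matrix is always in cluster 1. Cluster 1 is therefore not necessarily the leftmost or rightmost branch of the plotted tree; the table above shows where each cluster actually falls.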
> > 
> > Best, 
> > Thomas
> > 
> > On Thu 12/13/07 09:58, James W. MacDonald wrote:
> > > Hi Alison,
> > > 
> > > alison waller wrote:
> > > > Hello Everyone,
> > > > 
> > > >  
> > > > 
> > > > I've been using heatmap and heatmap.2 to draw heatmaps for my experiments.
> > > > 
> > > >  
> > > > 
> > > > I have a heatmap of the M values of 6 arrays for the spots whose p-values
> > > > were <0.005 (from eBayes).
> > > > 
> > > > However, I would like to see which spots it has grouped together in the
> > > > row dendrogram.  Is there a way I can extract the information about the
> > > > spots that are clustered together?  I cannot read the row names, and even
> > > > if I could, I was hoping there would be some way to list the clusters and
> > > > save them to a file.
> > > 
> > > There are two ways to do this that I know of. And either can be a pain, 
> > > depending on how big the dendrogram is.
> > > 
> > > Both methods require you to construct your dendrogram first. You can 
> > > then choose the clusters with the mouse. This might be more difficult if
> > > you have some gigantic dendrogram and have ingested too much coffee ;-D.
> > > 
> > > Normally, one would simply do
> > > 
> > > heatmap(mymatrix, otherargs)
> > > 
> > > and accept the default clustering method. However, you can always 
> > > pre-construct the dendrograms and then feed those to heatmap().
> > > 
> > > Rowv <- as.dendrogram(hclust(dist(mymatrix)))
> > > Colv <- as.dendrogram(hclust(dist(t(mymatrix))))
> > > 
> > > heatmap(mymatrix, Rowv=Rowv, Colv=Colv, otherargs)
> > > 
> > > Now if you do something like that, then you can try
> > > 
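> > > # identify() has a method for hclust objects, not for dendrograms
> > > rowclust <- hclust(dist(mymatrix))
> > > plot(rowclust)
> > > a.cluster <- identify(rowclust)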
> > > 
> > > and then use your mouse to choose the upper left corner of a rectangle 
> > > that encompasses the cluster you are interested in. Here is where the 
> > > size of the dendrogram and the amount of coffee come in. If the 
> > > dendrogram is really large then identify() may not be able to figure out
> > > what you are trying to select, or may decide you are choosing the upper 
> > > right corner.
> > > 
> > > You can choose as many clusters as you want, and they will be in the 
> > > list a.cluster, in the order you selected.
> > > 
> > > A more programmatic method is to use rect.hclust() and either choose the
> > > height at which to make the cuts, or the number of clusters, etc. Again,
> > > depending on the size of your dendrogram, this may work well or it may 
> > > be painful.
> > > 
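A rough sketch of the rect.hclust() route described above. The matrix is made up, the choice of k = 3 is arbitrary, and the names rowclust and members are hypothetical. pdf(NULL) is used only so that the sketch runs without opening a window:

```r
# Hypothetical stand-in for mymatrix: 10 spots x 6 arrays
set.seed(3)
mymatrix <- matrix(rnorm(60), nrow = 10, ncol = 6,
                   dimnames = list(paste0("spot", 1:10), paste0("array", 1:6)))

# rect.hclust() works on the hclust object and needs the tree plotted first
rowclust <- hclust(dist(mymatrix))
pdf(NULL)            # null device; drop this line to draw on screen
plot(rowclust)

# Draw rectangles around k = 3 clusters; the members are returned invisibly
members <- rect.hclust(rowclust, k = 3, border = "red")
dev.off()

# One list element per cluster, each a named vector of row indices
print(lapply(members, names))
```

Passing h instead of k cuts at a given height rather than into a given number of clusters.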
> > > Best,
> > > 
> > > Jim
> > > 
> > > 
> > > > 
> > > >  
> > > > 
> > > > Thanks,
> > > > 
> > > >  
> > > > 
> > > > Alison  
> > > > 
> > > >  
> > > > 
> > > > ******************************************
> > > > Alison S. Waller  M.A.Sc.
> > > > Doctoral Candidate
> > > > awaller at chem-eng.utoronto.ca
> > > > 416-978-4222 (lab)
> > > > Department of Chemical Engineering
> > > > Wallberg Building
> > > > 200 College st.
> > > > Toronto, ON
> > > > M5S 3E5
> > > > 
> > > >   
> > > > 
> > > >  
> > > > 
> > > > 
> > > > 
> > > > _______________________________________________
> > > > Bioconductor mailing list
> > > > Bioconductor at stat.math.ethz.ch
> > > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> > > 
> > > -- 
> > > James W. MacDonald, M.S.
> > > Biostatistician
> > > Affymetrix and cDNA Microarray Core
> > > University of Michigan Cancer Center
> > > 1500 E. Medical Center Drive
> > > 7410 CCGC
> > > Ann Arbor MI 48109
> > > 734-647-5623
> > > 
> > > 
> > 
> > 
> 
> 

-- 
Thomas Girke
Assistant Professor of Bioinformatics
Director, IIGB Bioinformatic Facility
Center for Plant Cell Biology (CEPCEB)
Institute for Integrative Genome Biology (IIGB)
Department of Botany and Plant Sciences
1008 Noel T. Keen Hall
University of California
Riverside, CA 92521

E-mail: thomas.girke at ucr.edu
Website: http://faculty.ucr.edu/~tgirke
Ph: 951-827-2469
Fax: 951-827-4437


