[BioC] heatmap.2 and makeContrasts

James W. MacDonald jmacdon at med.umich.edu
Thu Mar 3 17:33:10 CET 2011


Hi Supriya,

On 3/2/2011 10:44 AM, Supriya Munshaw wrote:
> Hi all, I have two questions about using R and Bioconductor.
>
> Question 1: I'm using heatmap.2 to make a heatmap for my top
> differentially expressed genes. I also create a dendrogram for my
> columns that clusters by sample. However, is there a way to modify
> these dendrograms? For example, if you look at the color coding in the
> attached heatmap, I have clustered by 2 regions. But if you look
> closely, there is no reason that the dendrogram can't be flipped so
> that the green sections align, i.e. the first blue section from the
> left can be swapped with the second green section from the left. That
> would keep the same information but provide a better visual
> representation of the clustering. Does anyone know how I can do
> this?
>

I don't think it is easily done. You might be able to hack at the 
hclust() code or output to give what you want, but it won't be via a 
simple argument to hclust().
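
One possible workaround, offered as a sketch and not as part of the reply above: the branches of a dendrogram can be rotated at any node without changing the clustering, and reorder() from the stats package does this for a dendrogram object according to one weight per leaf. The data matrix and the `region` vector below are made up for illustration. Whether the rotation lines up the colors exactly as wanted depends on the tree, since only swaps within a node are possible.

```r
set.seed(1)

# Hypothetical data: 20 genes (rows) by 8 samples (columns)
mat <- matrix(rnorm(160), nrow = 20,
              dimnames = list(NULL, paste0("S", 1:8)))

# Hypothetical region label for each sample, in column order
region <- rep(c("green", "blue"), times = 4)

# Cluster the samples and convert the result to a dendrogram
dend <- as.dendrogram(hclust(dist(t(mat))))

# One weight per sample, in the original column order: at each node the
# branch with the smaller mean weight is placed on the left, so
# "green" samples are pulled left wherever the tree allows it
wts <- ifelse(region == "green", 1, 2)
dend_flipped <- reorder(dend, wts, agglo.FUN = mean)

# Same leaves and same merge heights; only the left-to-right order may differ
labels(dend)
labels(dend_flipped)

# The rotated tree could then be handed to heatmap.2 (gplots), e.g.
# heatmap.2(mat, Colv = dend_flipped, ColSideColors = region)
```

The heatmap.2 call is left commented out so that the sketch runs with base R alone.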


> Question 2:
>
> My phenotype data file looks like this:
>
> Patient   Disease State   Tissue
> A         D               T1
> A         D               T2
> B         D               T1
> B         D               T2
> C         N               T1
> C         N               T2
> D         N               T1
> D         N               T2
>
> So the first comparison I want to make is between disease and
> non-disease in all tissues. I can do that in 2 ways:
>
> Option 1:
>
>   desMat <- model.matrix(~0 + DiseaseState)
>   colnames(desMat) <- levels(DiseaseState)
>   contMat <- makeContrasts(D-N, levels = colnames(desMat))
>
> I'm assuming this groups all disease states in one group and all
> non-disease states in another, without regard to patient, treating
> each sample independently, which is fine.
>
> Option 2:
>
>   Combine <- factor(paste(DiseaseState, Tissue, sep="."))
>   # So now my states are D.T1, D.T2, N.T1, N.T2
>   desMat <- model.matrix(~0 + Combine)
>   colnames(desMat) <- levels(Combine)
>   contMat <- makeContrasts(((D.T1+D.T2)/2) - ((N.T1+N.T2)/2),
>                            levels = colnames(desMat))
>
> Shouldn't options 1 and 2 give me the same answer? In my case they
> do not, and I'm not sure I understand why.

No, they should not. You are asking a subtly different question in each 
case. In option 1 you are ignoring any differences between the tissues 
and asking if there is a difference between disease states. In option 2 
you are accounting for the tissue differences and then asking if there 
is a difference between the disease states.

The difference comes from how the denominator of the t-statistic is 
constructed. With a balanced design like yours, the numerator (the 
estimated difference between disease states) is the same under both 
options. In simple terms, the denominator is an average of the 
variability within the groups being compared. In option 1, you are 
computing the variability within the diseased group and the normal group 
separately and then averaging them. In option 2 you are computing the 
variability within the D.T1, D.T2, N.T1, N.T2 groups separately and then 
averaging.

So if the tissues are quite different in expression, but are consistent 
within each disease state/tissue type, then you will tend to get 
significance in option 2 but not in option 1. As an example:

D.T1 = c(4.5,4.3,4.7,4.2)
D.T2 = c(6.4,5.8,6.0,5.8)
N.T1 = c(6.5,6.3,6.1,6.6)
N.T2 = c(7.3,7.2,7.0,7.5)

Here you can see that the within-group variability is very small, but if 
you pool the samples within each disease state across tissues, the 
variability will increase quite a bit, and the difference between 
disease states may well no longer be significant.
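
The effect can be checked on the four example vectors with a short sketch in base R. This uses ordinary least squares through lm() and an ordinary t-statistic, which is an assumption made to keep the example self-contained. limma's moderated t-statistic also borrows information across genes, so its values would differ. The helper `contrast_t` is hypothetical and written only for this illustration.

```r
# Example values from above, one vector per disease state / tissue group
D.T1 <- c(4.5, 4.3, 4.7, 4.2)
D.T2 <- c(6.4, 5.8, 6.0, 5.8)
N.T1 <- c(6.5, 6.3, 6.1, 6.6)
N.T2 <- c(7.3, 7.2, 7.0, 7.5)

y <- c(D.T1, D.T2, N.T1, N.T2)
DiseaseState <- factor(rep(c("D", "N"), each = 8))
Combine <- factor(rep(c("D.T1", "D.T2", "N.T1", "N.T2"), each = 4))

# Hypothetical helper: estimate, standard error and ordinary t-statistic
# for a linear contrast of the coefficients of a fitted lm
contrast_t <- function(fit, contrast) {
  est <- sum(contrast * coef(fit))
  se <- sqrt(drop(t(contrast) %*% vcov(fit) %*% contrast))
  c(estimate = est, se = se, t = est / se)
}

# Option 1: two groups, tissue ignored; contrast D - N
fit1 <- lm(y ~ 0 + DiseaseState)
res1 <- contrast_t(fit1, c(1, -1))

# Option 2: four groups; contrast (D.T1 + D.T2)/2 - (N.T1 + N.T2)/2
fit2 <- lm(y ~ 0 + Combine)
res2 <- contrast_t(fit2, c(0.5, 0.5, -0.5, -0.5))

print(rbind(option1 = res1, option2 = res2))
```

The estimate is the same in both rows because the design is balanced. The standard error is larger in option 1, where the tissue difference is absorbed into the residual variance, so its t-statistic is smaller in magnitude.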

Best,

Jim




>
> I would really appreciate any help. Thank you!
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826