[BioC] hierarchical clustering of chips

Tue Apr 27 17:18:20 CEST 2004

Hello,

I'd be interested to hear about your experience with hierarchical clustering and dendrogram visualization (via hclust) of several chips that have been normalized across chips.

I'm not running the clustering on all the genes on the chip. I select those genes that are significantly affected by the factors of interest (e.g. dose+time) via a linear model and ANOVA. What pre-selection do you use (if any)?
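
A minimal sketch of such a pre-selection in R, on simulated data. The object names (expr, dose, time), the simulated values, the use of the smallest per-term p-value, the Benjamini-Hochberg adjustment and the 0.05 cut-off are all assumptions for illustration, not a description of any particular pipeline:

```r
# Hypothetical data: 100 genes x 12 chips of log-scale intensities
set.seed(1)
dose <- factor(rep(c("low", "high"), each = 6))
time <- factor(rep(rep(c("4h", "24h"), each = 3), times = 2))
expr <- matrix(rnorm(100 * 12), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("chip", 1:12)))
# Let the first five genes respond to dose so that there is some signal
expr[1:5, dose == "high"] <- expr[1:5, dose == "high"] + 3

# Fit one linear model per gene and keep the smallest p-value
# over the model terms (the residual row has no p-value, hence na.rm)
gene.p <- apply(expr, 1, function(y) {
  tab <- anova(lm(y ~ dose + time))
  min(tab[["Pr(>F)"]], na.rm = TRUE)
})

# Adjust for multiple testing and keep the genes below the cut-off
selected <- names(gene.p)[p.adjust(gene.p, method = "BH") < 0.05]
expr.sel <- expr[selected, , drop = FALSE]
```

Taking the minimum over the terms is a simplification; testing the full model against an intercept-only model would be an alternative.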

I then take the mean or median of technical replicates, to reduce the number of leaves of the tree.
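
Collapsing the replicates could look roughly like the following sketch. The matrix expr, the treatment labels and the choice of the median are made up for the example:

```r
# Hypothetical data: 20 genes x 6 chips
set.seed(2)
expr <- matrix(rnorm(20 * 6), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("chip", 1:6)))
# Hypothetical labels: the treatment each chip is a technical replicate of
treatment <- c("ctrl", "ctrl", "low", "low", "high", "high")

# For every treatment, take the per-gene median over its replicate chips;
# the result has one column per treatment instead of one per chip
expr.med <- sapply(split(seq_along(treatment), treatment), function(idx) {
  apply(expr[, idx, drop = FALSE], 1, median)
})
```

Replacing median with mean gives the mean-based variant.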

I've realized that the outcome of the clustering depends strongly not just on the selection of the input genes, but also on whether intensities or ratios (treated versus control) are used for the clustering. When do you use intensities and when ratios?

Last, some of the hierarchical clusterings I've been working with seem to make more sense when I don't use the intensities (or ratios) directly, but instead calculate the correlation matrix between all chips (or treatments) and derive a distance matrix from it (as.dist(1 - cor(intensity.matrix))). In this case Spearman correlation seems to be more reasonable than Pearson.
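
Spelled out as a small sketch on simulated data, the correlation-based clustering looks like this. The simulated intensity.matrix and the average linkage are assumptions for the example (hclust uses complete linkage when no method is given):

```r
# Hypothetical data: 50 selected genes x 8 chips
set.seed(3)
intensity.matrix <- matrix(rnorm(50 * 8), nrow = 50,
                           dimnames = list(paste0("gene", 1:50),
                                           paste0("chip", 1:8)))

# cor() correlates the columns, i.e. the chips; Spearman works on ranks
chip.cor <- cor(intensity.matrix, method = "spearman")

# Turn correlation into a dissimilarity: 0 for identical rankings,
# up to 2 for perfectly reversed rankings
chip.dist <- as.dist(1 - chip.cor)

# Cluster the chips; the linkage method is an illustrative choice
hc <- hclust(chip.dist, method = "average")
```

plot(hc) then draws the dendrogram, and method = "pearson" in cor() gives the Pearson variant for comparison.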

Maybe you want to comment on this or initiate a small discussion.

	kind regards,

	Arne

--
Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com