[BioC] Hier.Clustering: group size effect

Kevin R. Coombes krc at mdacc.tmc.edu
Fri Feb 3 15:47:57 CET 2006


If you already know the groups, then what's the point of doing 
clustering?  More precisely, what biological question do you think you 
are answering with this method?
	Kevin

Heike Pospisil wrote:
> Hello,
> 
> I have a question concerning hierarchical clustering and the effect of group sizes.
> 
> I would like to select genes that are differentially expressed between group A 
> and group B. Afterwards, I wish to cluster the samples by these genes. In 
> principle, it works fine, but I have a problem if the group sizes are 
> significantly unequal. For example:
> group A: 53 samples
> group B: 12 samples
> The resulting clustering brings group B together, but it is not clearly 
> separated from group A. Then again, if I take 12 samples from group A randomly 
> (to get equal group sizes), the clustering is nearly perfect.
> 
> I use hclust(dist(t(exprs(sub)),method="euclidean"),method="complete"),
> where ncol(sub) = size of group A + size of group B and nrow(sub) = number of
> significant genes. I also tried other distance measures, but without improvement.
> 
> Does anybody have a hint which clustering algorithm should be preferred for such 
> unequal group sizes?
> 
> Thanks in advance and best wishes,
> Heike
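
For anyone who wants to reproduce the setup described in the quoted message, here is a minimal sketch in base R. It is not the original analysis: the matrix x is a simulated stand-in for exprs(sub), and the gene count, the size of the shift between groups, and the seed are all made-up assumptions. It runs the same hclust() call on the 53-versus-12 layout, cuts the tree into two clusters, and then repeats the run with 12 randomly chosen group A samples so the two results can be compared.

```r
# Simulated stand-in for exprs(sub): rows are genes, columns are samples.
# The sizes, the shift, and the seed are illustrative assumptions.
set.seed(1)
n_genes <- 50
group <- factor(rep(c("A", "B"), times = c(53, 12)))
x <- matrix(rnorm(n_genes * length(group)), nrow = n_genes)

# Shift group B in every gene to mimic genes pre-selected as differential.
x[, group == "B"] <- x[, group == "B"] + 1.5
colnames(x) <- paste0(group, seq_along(group))

# Same call as in the post, applied to the simulated matrix.
hc <- hclust(dist(t(x), method = "euclidean"), method = "complete")

# Cut into two clusters and cross-tabulate against the known groups.
clusters <- cutree(hc, k = 2)
print(table(cluster = clusters, group = group))

# Repeat with 12 randomly chosen group A samples for a balanced comparison.
keep <- c(sample(which(group == "A"), 12), which(group == "B"))
hc_bal <- hclust(dist(t(x[, keep]), method = "euclidean"), method = "complete")
clusters_bal <- cutree(hc_bal, k = 2)
print(table(cluster = clusters_bal, group = droplevels(group[keep])))
```

Comparing the two tables shows how the two-cluster cut lines up with the known labels in the unbalanced and balanced cases. Simulated data may not show the same behaviour as the real samples.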


