[BioC] Correct use of a distance measure when clustering gene expression data

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Thu Sep 2 17:38:05 CEST 2004


Hi

I have two different datasets, both time-courses.  One uses a common
reference for the Cy3 channel, the other performs direct comparisons
between treated/untreated samples at each time-point.  In both cases the
actual data is log2(Cy5/Cy3).

After a bit of thought, I've come to the conclusion that as a distance
measure for the first dataset I will use "1 - Pearson correlation
coefficient".  However, for the second dataset, as we performed direct
comparisons at each time-point, using the correlation coefficient is not
appropriate, so I have decided to use Euclidean distance.
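
To make the difference between the two measures concrete, here is a
minimal Python sketch (not Bioconductor code).  The function names and
the three log2(Cy5/Cy3) profiles are made up for illustration and are
not taken from either dataset.  It only shows the general behaviour:
"1 - Pearson correlation" ignores the magnitude of a profile and looks
at its shape, whereas Euclidean distance is sensitive to both.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation for two equal-length profiles.

    Assumes neither profile is constant; a constant profile has zero
    variance and would cause a division by zero here.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return 1.0 - cov / norm


def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical log2 ratios over four time-points (illustrative values only).
gene_a = [0.5, 1.0, 1.5, 1.0]
gene_b = [1.0, 2.0, 3.0, 2.0]      # same shape as gene_a, twice the magnitude
gene_c = [-0.5, -1.0, -1.5, -1.0]  # mirror image of gene_a

for name, other in (("gene_b", gene_b), ("gene_c", gene_c)):
    print(name,
          "pearson distance:", round(pearson_distance(gene_a, other), 4),
          "euclidean distance:", round(euclidean_distance(gene_a, other), 4))
```

In this toy example gene_a and gene_b are at correlation distance 0 even
though one responds twice as strongly as the other, while Euclidean
distance keeps them apart.  Whether that difference in magnitude should
count is exactly the choice being made between the two measures.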

Does anyone have experience of which distance measure works best for
time-courses where direct comparisons are made at each time-point?

Cheers
Mick


