[BioC] How to do clustering

Sean Davis sdavis2 at mail.nih.gov
Mon Jun 11 12:15:00 CEST 2007


ssls sddd wrote:
> Dear Dr.Thomas Girke,
> 
> Thank you very much for the info.
> 
> I tried mydist <- as.dist(1-cor(t(y), method="pearson")) on my data but
> it failed. My 'y' consists of 238000 observations (rows)  and 49 samples
> (columns) and R said:
> 
> error in cor(t(x), method = "pearson") : allocMatrix: too many elements
> 
> Do you think I can make this work out in another way?

For folks relatively new to R and Bioconductor, it is worthwhile to read
the help pages for ALL new commands used.  In this case, the help page for
cor() states that it will compute the correlation between all COLUMNS of
the matrix, if given a matrix.  You have a matrix with 49 columns and
238,000 rows.  If you were to run cor() on that matrix, it would produce
a matrix of size 49x49 containing all pairwise correlations between
samples.  However, in this case, a transpose is applied first, so R is
going to try to compute a 238,000x238,000 matrix of correlation
coefficients.  I'm assuming that is not what you want and that you
really want to cluster the samples.  Drop the t() and all will be well.
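
To put numbers on it: a 238,000x238,000 matrix has about 5.7e10
elements, which at 8 bytes per double is roughly 450 GB, so the
allocation fails. Here is a minimal sketch of the sample-level
clustering, assuming rows are probes and columns are samples. The
simulated 'y' (1000 rows rather than 238,000), the name 'sample_cor',
and the choice of average linkage in hclust() are my assumptions for
illustration, not something taken from your data:

```r
# Small simulated stand-in for the real data: rows = probes, columns = samples.
# (Assumption: 1000 rows instead of 238,000 so the example runs quickly.)
set.seed(1)
y <- matrix(rnorm(1000 * 49), nrow = 1000, ncol = 49,
            dimnames = list(NULL, paste0("sample", 1:49)))

# cor() correlates COLUMNS, so without t() this is a 49 x 49 matrix
# of pairwise correlations between samples.
sample_cor <- cor(y, method = "pearson")
dim(sample_cor)

# Convert correlation to a distance and cluster the samples.
# (Assumption: average linkage; other hclust() methods work the same way.)
mydist <- as.dist(1 - sample_cor)
hc <- hclust(mydist, method = "average")
plot(hc)
```

If you really do want to cluster the rows instead, the usual approach
is to filter down to a few thousand of them first, so that the
row-by-row correlation matrix fits in memory.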

Sean


