[R] cluster/distance large matrix

Bart Thijs bart.thijs at econ.kuleuven.be
Thu Feb 11 14:22:49 CET 2010


Hi all,

I've stumbled upon some memory limitations for the analysis that I want to
run.

I have a matrix of distances between 38000 objects. These distances were
calculated outside of R, and I want to cluster the objects.

For smaller sets (e.g. n=100) this is how I proceed:
A<-matrix(scan(file, n=100*100),100,100, byrow=TRUE)
ad<-as.dist(A)
ahc<-hclust(ad,method="ward",members=NULL)
....

However, if I try this with the real dataset I run into memory problems.
I have the 64-bit version of R installed on a machine with 40 GB RAM (Windows
2003, 64-bit version).
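
To put rough numbers on the memory problem, here is a back-of-the-envelope sketch. It assumes the distances are stored as 8-byte doubles (R's default numeric type) and counts only the raw data, not the temporary copies that scan(), matrix(..., byrow=TRUE) and as.dist() may each create along the way.

```r
# Rough size estimate for n = 38000 objects, assuming 8-byte doubles.
n <- 38000

full_bytes <- n^2 * 8               # full n x n matrix
dist_bytes <- n * (n - 1) / 2 * 8   # lower triangle only, as stored in a "dist" object

full_gib <- full_bytes / 2^30       # roughly 10.8 GiB
dist_gib <- dist_bytes / 2^30       # roughly 5.4 GiB

cat(sprintf("full matrix: %.1f GiB, dist object: %.1f GiB\n", full_gib, dist_gib))
```

So a single copy of the full matrix already takes more than a quarter of the 40 GB, and the approach above holds several such copies at once.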

I'm thinking about using only the lower triangle of the matrix, but I can't
create a distance object for the clustering from the lower triangle
(lower.tri) alone.
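
One possible way to do this, given only as a sketch: a "dist" object is a plain numeric vector holding the lower triangle column by column, plus the attributes Size, Diag and Upper (see ?dist). It can therefore be filled one row at a time without ever holding the full matrix. The helper name read_dist_rowwise is hypothetical, and the code assumes the file contains the complete n x n matrix as whitespace-separated numbers written row by row, as implied by byrow=TRUE above.

```r
# Hypothetical helper (not part of base R): build a "dist" object from a full
# n x n distance matrix stored row by row in a text file, reading one row at a time.
read_dist_rowwise <- function(path, n) {
  d <- numeric(n * (n - 1) / 2)        # storage for the lower triangle only
  con <- file(path, open = "r")
  on.exit(close(con))
  for (i in seq_len(n)) {
    row <- scan(con, what = numeric(), n = n, quiet = TRUE)  # next n values = row i
    if (i > 1) {
      j <- seq_len(i - 1)
      # Position of element (i, j), j < i, inside a dist vector (formula from ?dist)
      idx <- n * (j - 1) - j * (j - 1) / 2 + i - j
      d[idx] <- row[j]
    }
  }
  structure(d, Size = as.integer(n), Diag = FALSE, Upper = FALSE, class = "dist")
}

# Small self-check on a 5 x 5 example written to a temporary file.
A <- as.matrix(dist(c(0, 1, 3, 7, 15)))
tmp <- tempfile()
write(t(A), tmp, ncolumns = 5)         # one matrix row per line
d <- read_dist_rowwise(tmp, 5)
same_as_as_dist <- isTRUE(all.equal(as.vector(d), as.vector(as.dist(A))))
```

The resulting d can be passed to hclust() in the same way as ad above. Whether hclust() itself then fits in memory for n = 38000 is not tested here, since it may need working space of its own.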

Can someone help me with a suggestion for which way to go?

Best Regards
Bart Thijs


