[R] daisy(): space allocation issue

Gavin Simpson gavin.simpson at ucl.ac.uk
Thu Aug 26 17:46:56 CEST 2010


On Thu, 2010-08-26 at 07:35 -0700, abanero wrote:
> Hi,
> 
> I'm trying to apply the function daisy() to a data.frame 10000x10 but I have
> not enough space (error message: cannot allocate vector of length
> 1476173280).
> 
> I didn't imagine I would be unable to work with a matrix of just 10000
> observations... I have set --max-mem-size=2G in Rgui (I'm not able to set
> more space..)

You are trying to make a 10,000 x 10,000 matrix of dissimilarities.
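
As a rough sketch of the arithmetic, assuming 8-byte doubles: the snippet
below works out how much memory the dissimilarities alone would need, both
as a full square matrix and as the lower triangle that a "dist"-style
object stores. The variable names are made up for illustration, and the
figures are lower bounds for the result only; any working copies made
during the computation come on top.

```r
# Rough memory arithmetic for pairwise dissimilarities on n observations.
# Assumption: each dissimilarity is stored as an 8-byte double.
n <- 10000
bytes_per_double <- 8

# Full square n x n matrix
full_bytes <- n * n * bytes_per_double

# Lower triangle only: n * (n - 1) / 2 values
tri_values <- n * (n - 1) / 2
tri_bytes <- tri_values * bytes_per_double

# Report both sizes in MiB
cat("full matrix:   ", round(full_bytes / 2^20), "MiB\n")
cat("lower triangle:", round(tri_bytes / 2^20), "MiB\n")
```

Either way, a large fraction of a 2G limit goes on a single object before
anything is done with it.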

> How can I solve this issue? Separating observations depending on some rules?

Get/use a machine with more RAM?

I doubt separating observations into chunks and doing the dissimilarity
computations on those chunks then recombining will work as the end
result will still be the 10k x 10k matrix. (If that is what you meant.)

What do you want to do with the dissimilarities? If clustering, try the
clara() function in the same package (cluster) as daisy. But then you'd
need to work out whether clustering such a large number of observations
is a useful activity...
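
A minimal sketch of what that might look like, under some loud
assumptions: the data here are simulated stand-ins for yours, k = 3 is an
arbitrary choice, and the sample settings are just illustrative. Note that
clara() wants all-numeric variables, so if you were using daisy() because
you have mixed variable types this won't be a drop-in replacement.

```r
library(cluster)

set.seed(1)
# Hypothetical stand-in for the real data: 10000 observations, 10 numeric variables
x <- matrix(rnorm(10000 * 10), nrow = 10000, ncol = 10)

# clara() clusters sub-samples of the data, so it never builds the
# full set of pairwise dissimilarities. k = 3 is arbitrary here.
fit <- clara(x, k = 3, samples = 5, sampsize = 200)

# Cluster sizes and the medoid observations
table(fit$clustering)
fit$medoids
```

Larger values of samples and sampsize cost more time but should give more
stable medoids.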

If something else, perhaps let us know what you want to do with the
dissimilarities or what you are trying to achieve as there may be other
things that you can do instead.

> thanks

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


