[R] 50993 point distance matrix, too big to as.matrix, looking for another way to calculate point-level summary

Romain Francois romain.francois at dbmail.com
Sat Jun 27 09:32:47 CEST 2009


Hi,

If you are only interested in row means, you can work with the distance
matrix at the C level.

You might like to adapt this post:
http://tolstoy.newcastle.edu.au/R/e6/devel/09/04/1378.html
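
If you would rather stay in plain R, here is a minimal sketch of the same
idea. It is not the C code from the post above, only an illustration, and
the function names dist_index() and dist_row_means() are made up for this
example. It assumes the layout documented in ?dist: the lower triangle is
stored column by column, so for i < j the distance between rows i and j
sits at position n*(i-1) - i*(i-1)/2 + j-i of the vector.

```r
# Position of d(i, j) inside a "dist" vector of n points, valid for i < j.
# (Hypothetical helper; the formula is the one given in ?dist.)
dist_index <- function(i, j, n) {
  i <- as.numeric(i)  # doubles avoid integer overflow when n is large
  j <- as.numeric(j)
  n * (i - 1) - i * (i - 1) / 2 + j - i
}

# Row means of the full symmetric matrix, without calling as.matrix().
# Only one column of the lower triangle is held in memory at a time.
dist_row_means <- function(d) {
  n <- attr(d, "Size")
  row_sums <- numeric(n)
  start <- 1  # position of d(i, i + 1) in the dist vector, kept as a double
  for (i in seq_len(n - 1)) {
    len <- n - i
    block <- d[start + seq_len(len) - 1]  # d(i, i + 1), ..., d(i, n)
    # Each stored distance counts once for row i and once for the other row.
    row_sums[i] <- row_sums[i] + sum(block)
    rows <- (i + 1):n
    row_sums[rows] <- row_sums[rows] + block
    start <- start + len
  }
  # Divide by n to match rowMeans(as.matrix(d)), which includes the zero
  # diagonal; divide by (n - 1) instead to leave the diagonal out.
  row_sums / n
}

# Small check against the as.matrix() route on data that does fit in memory.
set.seed(1)
x <- matrix(rnorm(40), nrow = 10)
d <- dist(x)
stopifnot(isTRUE(all.equal(dist_row_means(d),
                           unname(rowMeans(as.matrix(d))))))
```

The loop makes n - 1 vectorised passes, so it should be usable on a dist
object of your size, although C code will be faster.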

Romain

On 06/26/2009 09:40 PM, leif olson wrote:
> Hello, I'm working on a 50993 point count bird abundance dataset. I've
> succeeded in calculating a distance matrix for this entire set, but I don't
> have sufficient memory to convert this to a matrix, as below...
> abun.dist<- dist(abun.mat[1:50993,1:235])
> test<- rowMeans(as.matrix(abun.dist))
> Error in matrix(0, size, size) : too many elements specified
>
> I've been able to run an hclust() clustering procedure, because
> hclust() calls Fortran code, but I'd like to be able to generate a
> Calinski index for each of the clusters to assess their validity.
> Unfortunately, the validation routines I have found are all native R
> code, and usually call as.matrix, resulting in the same error I receive
> above.
> What I'd like to figure out is how to just go through, one point at a time,
> and calculate the values I need. But I've been unable to come up with code
> to call the correct positions in the dist vector. Can anyone suggest some
> code that might do this? Thanks...
>
> ...leif
>    


-- 
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



