[R] Creating cluster with dbscan from bigmemory

Tiago Cunha tiagodscunha at gmail.com
Fri Mar 8 12:47:50 CET 2013


I am trying to create a very large matrix with big.matrix from the
bigmemory package in order to apply dbscan to it afterwards.

I have two problems:

I need to create 4 matrices with 120000 rows x 120000 columns. I have
tested various R packages for big data (particularly bigmemory and
ff). Since ff cannot create matrices bigger than approximately
45000 x 45000, because the number of cells is limited by
.Machine$integer.max, I tried the bigmemory package.
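
To make the numbers concrete, here is a small sketch of the arithmetic
as I understand it. It uses base R only and assumes the cells are stored
as 8-byte doubles, which is my assumption and not something any package
forces:

```r
# Largest square matrix whose cell count still fits in a signed 32-bit index
max_cells <- .Machine$integer.max      # 2147483647
max_side  <- floor(sqrt(max_cells))    # side length of that square matrix

# Size of one 120000 x 120000 matrix, assuming 8-byte doubles
n     <- 120000                        # a double, so n * n does not overflow
cells <- n * n
gib   <- cells * 8 / 2^30              # bytes converted to GiB

print(max_side)
print(cells > max_cells)
print(gib)
```

The last value is consistent with the size of the backing file I mention
below.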

I tried to create a matrix and it did create the .bin file, with a size
of 107GB, but it failed to create the .desc file, which is also necessary.
I would like to know whether this is impossible due to the size or for
some other technical reason.
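
For reference, this is roughly the kind of call I mean, shown as a
minimal sketch at a much smaller size. The file names and the use of
tempdir() are placeholders chosen for illustration, not my real setup:

```r
library(bigmemory)

# Placeholder location and file names, chosen only for this sketch
backing_dir <- tempdir()

# Small file-backed matrix; the real case would use nrow = ncol = 120000
x <- filebacked.big.matrix(nrow = 1000, ncol = 1000, type = "double",
                           backingfile = "dist_demo.bin",
                           descriptorfile = "dist_demo.desc",
                           backingpath = backing_dir)

# Both the backing file and the descriptor file should exist afterwards
created <- file.exists(file.path(backing_dir,
                                 c("dist_demo.bin", "dist_demo.desc")))
print(created)

# The descriptor file is what allows the matrix to be re-attached later
y <- attach.big.matrix(file.path(backing_dir, "dist_demo.desc"))
print(dim(y))
```

At this size both files are expected to be written; it is only at the
full size that the .desc file goes missing for me.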

The second problem is how to use this type of matrix (as a dissimilarity
matrix) in dbscan. I tried to pass one as the argument, but it cannot be
converted to a data.frame or a matrix due to the unavailability of
the necessary amount of RAM (obviously).
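
On a small in-memory matrix the call works, as in the sketch below. It
assumes dbscan from the fpc package with method = "dist"; the simulated
points and the eps and MinPts values are made up for illustration:

```r
library(fpc)

set.seed(1)
# Two made-up, well separated groups of 50 points each
pts <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2))

# Full dissimilarity matrix held in RAM as an ordinary R matrix
d <- as.matrix(dist(pts))

# method = "dist" tells fpc::dbscan to treat the input as distances
res <- fpc::dbscan(d, eps = 1, MinPts = 5, method = "dist")
print(table(res$cluster))
```

The difficulty is that d here is an ordinary matrix, which is exactly
what I cannot build at 120000 x 120000.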

If anyone has ever dealt with one or both of these problems, please
let me know how you solved it.

Thanks.

Tiago Cunha

-- 
Tiago Daniel Sá Cunha
Faculdade de Engenharia da Universidade do Porto
Skype: tiagodscunha
Facebook: http://www.facebook.com/TiagoDSCunha
Alternative email 1: ei08142 at fe.up.pt
Alternative email 2: tiagodscunha at hotmail.com


