[R] Reading large sparse arff files into R

andy1234 listanand at gmail.com
Sat Dec 31 22:09:50 CET 2011


I am trying to read a large, highly sparse ARFF file produced by WEKA into R.
However, the 'RWeka' package chokes on this file. The data set has about 40k
observations and about 20k dimensions. Even after 1 hour, RWeka's read.arff
is still trying to read the file, whereas WEKA reads it in less than
20 seconds.

What are my options at this point? I have also looked at the 'foreign'
package, but that has its own quirks (it can't read sparse ARFF files, which
makes it impossible to use at the scale of my data).

And is there a way to read the ARFF file directly into a sparse object in
R (a sparse matrix from the Matrix package, for example) instead of a data
frame?
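
To make the last question concrete, here is a rough sketch of what such a direct reader might look like. It is only an illustration under stated assumptions: every attribute is numeric, attribute names are unquoted and contain no spaces, and every data row uses the sparse "{index value, ...}" form with 0-based indices. The function name read_sparse_arff is hypothetical and is not part of RWeka or foreign; the only package call used is sparseMatrix() from Matrix.

```r
library(Matrix)

# Hypothetical helper (not from RWeka or foreign): parse an all-numeric
# sparse ARFF file straight into a sparse matrix, skipping the data frame.
read_sparse_arff <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines) & !startsWith(lines, "%")]  # drop blanks and comments

  # Everything before "@data" is header, everything after is data rows
  data_at <- which(tolower(lines) == "@data")[1]
  header  <- lines[seq_len(data_at - 1)]
  body    <- lines[-seq_len(data_at)]

  # Attribute names in declaration order (assumes unquoted, space-free names)
  attr_lines <- header[grepl("^@attribute\\s", header, ignore.case = TRUE)]
  attr_names <- sub("^@attribute\\s+(\\S+).*$", "\\1", attr_lines,
                    ignore.case = TRUE)

  # Pull every "index value" pair out of each row, e.g. "{0 1.5, 7 2}"
  hits <- regmatches(body, gregexpr("[0-9]+\\s+[^,}]+", body))
  flat <- trimws(unlist(hits))

  row_i <- rep(seq_along(hits), lengths(hits))         # row of each pair
  col_j <- as.integer(sub("\\s.*$", "", flat)) + 1L    # ARFF indices are 0-based
  val   <- as.numeric(sub("^\\S+\\s+", "", flat))

  # Entries not listed in a sparse ARFF row are zeros, which is exactly
  # what a sparse matrix leaves unstored.
  sparseMatrix(i = row_i, j = col_j, x = val,
               dims = c(length(body), length(attr_names)),
               dimnames = list(NULL, attr_names))
}

# Small self-contained check on a toy file
tmp <- tempfile(fileext = ".arff")
writeLines(c(
  "@relation toy",
  "@attribute a numeric",
  "@attribute b numeric",
  "@attribute c numeric",
  "@data",
  "{0 1.5, 2 3}",
  "{1 2}"
), tmp)
m <- read_sparse_arff(tmp)
dim(m)  # 2 rows, 3 columns
```

A sketch like this reads the whole file with readLines() and does the parsing with vectorised string operations, so memory use grows with the number of non-zero entries rather than with rows times columns. Nominal or string attributes, quoted names, and dense rows would need extra handling that is not shown here.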


View this message in context: http://r.789695.n4.nabble.com/Reading-large-sparse-arff-files-into-R-tp4249409p4249409.html
Sent from the R help mailing list archive at Nabble.com.
