[R] efficient equivalent to read.csv / write.csv

Gabor Grothendieck ggrothendieck at gmail.com
Sun Sep 26 18:25:59 CEST 2010


On Sun, Sep 26, 2010 at 8:38 AM, statquant2 <statquant at gmail.com> wrote:
>
> Hello everyone,
> I currently run R code that has to read 100 or more large csv files (>= 100
> MB each), and usually has to write csv files too.
> My colleagues and I like R very much but are a little astonished by how
> slow read.csv and write.csv are. We have looked at every argument of those
> functions, and while specifying some parameters helps a bit, it is still too
> slow.
> I am sure a lot of people have the same problem, so I thought one of you
> might know a trick or a package that could speed this up considerably.
>
> (we work on Red Hat Linux with R 2.10.0, but I guess that is of no use
> for this problem)
>
> Thanks for reading this.
> Have a nice weekend

You could try read.csv.sql in the sqldf package:

http://code.google.com/p/sqldf/#Example_13._read.csv.sql_and_read.csv2.sql

See ?read.csv.sql in sqldf.  It uses RSQLite and SQLite to read the
file into an SQLite database (which it sets up for you), completely
bypassing R's own input routines; it then pulls the data from the
database into R and removes the database it created at the end.
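
For example, a minimal sketch (the file name and the price column
below are hypothetical):

  library(sqldf)

  # reads the file into a temporary SQLite database and then
  # pulls it into R, dropping the database afterwards
  DF <- read.csv.sql("large.csv")

  # the sql argument can also filter on the database side, so
  # only the rows you need ever reach R
  DF2 <- read.csv.sql("large.csv",
      sql = "select * from file where price > 100")

Note that in the sql argument the table is always referred to as
file, regardless of the actual file name.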

There are also CSVREAD and CSVWRITE SQL functions in the H2 database,
which sqldf also supports, although I have never checked their speed:
http://code.google.com/p/sqldf/#10.__What_are_some_of_the_differences_between_using_SQLite_and_H
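
Something along these lines should work (untested; it assumes the RH2
package is installed, since loading RH2 before sqldf makes sqldf use
H2 rather than SQLite, and the file name is hypothetical):

  library(RH2)   # load before sqldf so that H2 is used
  library(sqldf)

  # H2's CSVREAD SQL function reads the csv file inside the
  # database, bypassing R's own input routines
  DF <- sqldf("SELECT * FROM CSVREAD('mydata.csv')")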

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


