[R] efficient equivalent to read.csv / write.csv

Gabor Grothendieck ggrothendieck at gmail.com
Wed Sep 29 02:03:10 CEST 2010


On Tue, Sep 28, 2010 at 1:24 PM, statquant2 <statquant at gmail.com> wrote:
>
> Hi, after testing
> R) system.time(read.csv("myfile.csv"))
>   user  system elapsed
>  1.126   0.038   1.177
>
> R) system.time(read.csv.sql("myfile.csv"))
>   user  system elapsed
>  1.405   0.025   1.439
> Warning messages:
> 1: closing unused connection 4 ()
> 2: closing unused connection 3 ()
>
> It seems that the function is less efficient than the base one ... so ...

The benefit comes with larger files.  With small files there is not
much point in speeding things up, since the absolute time is already small.
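To see where the crossover happens, one can repeat the timing on a larger file. The sketch below is illustrative, not from the thread: the row count, the temporary file, and the availability of the sqldf package are all assumptions.

```r
# Sketch: time read.csv vs. sqldf::read.csv.sql on a file large enough
# for the difference to be measurable. Row count is an illustrative choice.
n <- 2e5
f <- tempfile(fileext = ".csv")
write.csv(data.frame(x = rnorm(n), y = rnorm(n)), f, row.names = FALSE)

# Base reader
print(system.time(d1 <- read.csv(f)))

# read.csv.sql routes the file through a temporary SQLite database,
# which sidesteps read.csv's per-field type guessing. Guarded so the
# sketch still runs when sqldf is not installed.
if (requireNamespace("sqldf", quietly = TRUE)) {
  print(system.time(d2 <- sqldf::read.csv.sql(f)))
}
unlink(f)
```

On small files the fixed overhead of setting up the SQLite connection (the source of the "closing unused connection" warnings above) dominates, which is why the 1-second example shows no gain.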

Suggest you look at the benchmarks on the sqldf home page, where a
couple of users benchmarked larger files.  Since sqldf was intended
for convenience rather than performance, I was as surprised as anyone
when several users independently noticed that sqldf ran several times
faster than unoptimized R code.
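For context on "unoptimized R code": base read.csv itself can be sped up considerably by pre-declaring what it would otherwise have to guess. A minimal sketch, with an illustrative two-column file (the column layout is an assumption, not from the thread):

```r
# Sketch: tuning base read.csv by declaring column types up front.
n <- 1e5
f <- tempfile(fileext = ".csv")
write.csv(data.frame(x = rnorm(n), g = sample(letters, n, replace = TRUE)),
          f, row.names = FALSE)

# Unoptimized call: read.csv inspects every field to guess its class.
t_slow <- system.time(d1 <- read.csv(f))

# Tuned call: colClasses skips type guessing, nrows pre-sizes storage,
# and comment.char = "" disables comment scanning.
t_fast <- system.time(
  d2 <- read.csv(f, colClasses = c("numeric", "character"),
                 nrows = n, comment.char = "")
)
unlink(f)
```

Both calls return the same data frame; the point of comparisons like the sqldf benchmarks is that they are usually run against the untuned first form.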

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



More information about the R-help mailing list