[R] efficient equivalent to read.csv / write.csv

Gabor Grothendieck ggrothendieck at gmail.com
Wed Sep 29 05:43:39 CEST 2010


On Tue, Sep 28, 2010 at 5:02 PM, statquant2 <statquant at gmail.com> wrote:
>
> Hello all,
> the test I provided was just to point out that a single load of a big csv

A file that can be read in under 2 seconds is not big.

> file with read.csv was quicker than with read.csv.sql... I have already
> "optimized" my calls to read.csv for my particular problem, but if a simple
> call to read.csv is quicker than read.csv.sql, I doubt that specifying args
> would reverse the result by much...
>
> Maybe I should outline my problem:
>
> I am working on a powerful machine with 32 GB or 64 GB of RAM, so loading files
> and keeping them in memory is not really an issue.
> Those files (let's say 100) are shared by many people and are flat csv files
> (which is to say that modifying them is out of the question).
> Those files have lots of rows and between 10 and 20 columns, string and
> numeric...
>
> I basically need to be able to load these files as quickly as possible, and
> then I will keep those data frames in memory...
> So:
> Should I write my own C++ function and call it from R?
> Or is there an R way of drastically improving read.csv?
>
> Thanks a lot
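
(For concreteness, the kind of argument tuning referred to above usually means
pre-declaring the column types and a row count so read.csv does not have to
guess them; the file name and column classes below are placeholders, not taken
from the original post:)

  ## hypothetical tuned call: declare column types up front
  dat <- read.csv("one_of_the_shared_files.csv",
                  colClasses = c("character", rep("numeric", 12)),
                  nrows = 2e6)   # mild over-estimate of rows, helps memory use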

So you have a bunch of small files and want to read them fast.  Are
they always the same, or do they change, or a combination of the two?
If they are the same, or if many of them are, then read those once,
save() them as .RData files, and load() them when you want them.
The load will be very fast.
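
A rough sketch of that workflow (the directory and file names here are made up
for illustration):

  ## one-off step: read each shared csv once and cache it as an .RData file
  csvs <- list.files("shared_csv", pattern = "\\.csv$", full.names = TRUE)
  for (f in csvs) {
    dat <- read.csv(f)
    save(dat, file = sub("\\.csv$", ".RData", f))
  }

  ## later sessions: loading the cached copy restores 'dat' almost instantly
  load("shared_csv/file01.RData")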

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


