[R] efficient equivalent to read.csv / write.csv

David Scott d.scott at auckland.ac.nz
Tue Sep 28 22:16:31 CEST 2010


On 29/09/2010 6:24 a.m., statquant2 wrote:
>
> Hi, after testing
> R) system.time(read.csv("myfile.csv"))
>     user  system elapsed
>    1.126   0.038   1.177
>
> R) system.time(read.csv.sql("myfile.csv"))
>     user  system elapsed
>    1.405   0.025   1.439
> Warning messages:
> 1: closing unused connection 4 ()
> 2: closing unused connection 3 ()
>
> It seems that the function is less efficient than the base one ... so ...

I presume you have had a good look at the R Data Import/Export manual?

It warns there about the inefficiency of read.table (and hence of
read.csv) and suggests more direct use of scan, which in your case might
mean working with connections and readLines and writeLines.
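
As a rough sketch of what I mean (the example file, its three columns and
their types are made up for illustration, so adjust them to your own data),
you can either give read.csv the column types up front or call scan
directly, and use a connection with readLines and writeLines when you only
need the raw text:

```r
# Build a small example file (hypothetical data, for illustration only).
path <- tempfile(fileext = ".csv")
df <- data.frame(id = 1:1000,
                 x  = rnorm(1000),
                 g  = sample(letters, 1000, replace = TRUE))
write.csv(df, path, row.names = FALSE)

# 1. read.csv with hints: declaring column types and the row count
#    spares R from guessing them.
a <- read.csv(path,
              colClasses = c("integer", "numeric", "character"),
              nrows = 1000, comment.char = "")

# 2. scan directly: skip the header line and describe the columns
#    through the 'what' template list.
b <- scan(path,
          what = list(id = integer(), x = numeric(), g = character()),
          sep = ",", skip = 1, quiet = TRUE)
b <- as.data.frame(b, stringsAsFactors = FALSE)

# 3. Raw lines through a connection, with no parsing at all.
con <- file(path, "r")
lines <- readLines(con)
close(con)

out <- tempfile(fileext = ".csv")
writeLines(lines, out)
```

Wrapping each of the three reads in system.time on your own file should
show whether any of them helps in your case.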

If that doesn't work, why not go to a database? Use RODBC or some such
to read and write tables in the database. There are many options for
databases to use (MySQL works for me). You can easily read data in and
out of the database in .csv format. If the .csv files are similar, there
shouldn't be too much overhead in defining table formats for the database.
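
Something along these lines is what I have in mind with RODBC. This is only
a sketch: the helper csv_via_db and its arguments are hypothetical names of
mine, and it assumes that the RODBC package is installed and that an ODBC
data source (DSN) for your database has already been set up.

```r
# Hypothetical helper: load a .csv file into a database table, then read
# the table back into R. 'dsn' must name an existing ODBC data source.
csv_via_db <- function(dsn, csv_path, table) {
  channel <- RODBC::odbcConnect(dsn)
  if (!inherits(channel, "RODBC")) {
    stop("could not open ODBC connection to '", dsn, "'")
  }
  # Close the connection even if a later step fails.
  on.exit(RODBC::odbcClose(channel))

  dat <- read.csv(csv_path)
  # Creates the table from the data frame; column types are inferred.
  RODBC::sqlSave(channel, dat, tablename = table, rownames = FALSE)
  # Read the whole table back as a data frame.
  RODBC::sqlFetch(channel, table)
}

# Example call (the DSN and table names are placeholders):
# out <- csv_via_db("my_mysql_dsn", "myfile.csv", "mytable")
```

Once the data are in the database you can use sqlQuery to pull only the
rows and columns you need instead of rereading the whole file.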


David Scott

-- 
_________________________________________________________________
David Scott	Department of Statistics
		The University of Auckland, PB 92019
		Auckland 1142,    NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:	d.scott at auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics


