[R] RODBC sqlQuery insert slow

Michel Lang michellang at gmail.com
Fri Oct 13 17:19:09 CEST 2006


> I am trying to insert a lot of data into a table using windows R (2.3.1)
> and a mysql database via RODBC.
> First I read a file with read.csv and then form sql insert statements
> for each row and execute the insert query one row at a time. See the
> loop below.
> This turns out to be very slow.
> Can anyone please suggest a way to speed it up?

A few weeks ago I had to solve a similar task. Inserting each row turned out 
to be horrible slow due to paste() and the data.frame-indexing. The estimated 
runtime would have been over 3 weeks, so I used MySQLs LOAD DATE INFILE 
syntax to speed things up. You must have FILE_PRIV = 'Y' set in the 
mysql.user-table to use this small hack, and I'm not that sure that this runs 
remotely. It is also assumed that your df has valid column-names.

tmp_filename <- tempfile()

write.table(df, tmp_filename, na = "\\N", row.names = FALSE, col.names = 
FALSE, quote = FALSE, sep = "\t")

query <- paste(
    "LOAD DATA LOCAL INFILE '",  tmp_filename, "'",
    " INTO TABLE ", your_table, " (", toString(names(df)), ");", sep = "")

sqlQuery(channel, query)
unlink(tmp_filename)

The total runtime for the LOAD DATA INFILE querys was something below 5 
minutes, inserting 3e+6 rows with > 200 columns.

Michel Lang



More information about the R-help mailing list