[R] Fast update of a lot of records in a database?

McGehee, Robert Robert.McGehee at geodecapital.com
Fri May 19 17:48:02 CEST 2006


I was going to suggest sqlUpdate in RODBC, but it looks like that
function also uses the UPDATE command repeated nrow times.

A second strategy that I generally prefer because it does not require
RODBC (as much) and better supports transaction control is to first
create a temporary table with the new columns in it and an identifier
column (perhaps using sqlSave). Then you can join the two tables on the
identifier column and set the old column to the new column en masse
using UPDATE.

Often the bottleneck in doing row-by-row updates is searching for the
index of the id each time, whereas doing the entire join up front and
then updating often speeds this up considerably. In general, if you are
ever doing something that resembles a FOR loop in SQL, there's a faster
way.

Something like this is what I have in mind, although you might need to
tweak for PostgreSQL syntax:

UPDATE table
SET col1 = 
	(SELECT new.col1 FROM table AS old
	JOIN tempTable AS new
	ON old.idCol = new.idCol)

You should also make sure that your table is indexed well to optimize
for updates.

HTH,
Robert

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Duncan Murdoch
Sent: Friday, May 19, 2006 11:17 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Fast update of a lot of records in a database?

We have a PostgreSQL table with about 400000 records in it.  Using 
either RODBC or RdbiPgSQL, what is the fastest way to update one (or a 
few) column(s) in a large collection of records?  Currently we're 
sending sql like

BEGIN
UPDATE table SET col1=value WHERE id=id
(repeated thousands of times for different ids)
COMMIT

and this takes hours to complete.  Surely there must be a quicker way?

Duncan Murdoch

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html




More information about the R-help mailing list