[R] Fast update of a lot of records in a database?

Steve Miller steve.miller at jhu.edu
Sat May 20 14:15:39 CEST 2006


Though I'd question the prudence of doing mass database changes through R: 

Other things equal, single updates in a loop should be slower than a
correlated update due to the performance costs of bind and execute (and
perhaps even parse). Also, updates are generally slower than inserts because
of rollback and logging, and the potential of introducing chained blocks to
hold the updates. The correlated temporary table update should be faster
than the original, provided indexing is appropriate (i.e an index on
bigtable.id).

Any chance you could create a new bigtable as a join select of old bigtable
and updates? This would be faster still. 

Steve Miller

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of hadley wickham
Sent: Friday, May 19, 2006 2:20 PM
To: Duncan Murdoch
Cc: Robert.McGehee at geodecapital.com; r-help at stat.math.ethz.ch;
br44114 at gmail.com
Subject: Re: [R] Fast update of a lot of records in a database?

> put the updates into a temporary table called updates
>
> UPDATE bigtable AS a
>    FROM updates AS b
>    WHERE a.id = b.id
>    SET a.col1 = b.col1

I don't think this will be any faster - why would creating a new table
be faster than updating existing rows?

I've never had a problem with using large numbers of SQL update
statements (in the order of hundreds of thousands) to update a table
and having them complete in a reasonable time (a few minutes).  How
heavily indexed is the field you are updating?  You may be able to get
some speed improvements by turning off indices before the update and
back on again afterwards (highly dependent on your database system
though).

I would strongly suspect your bottleneck lies elsewhere (eg. when
generating the statements in R, or using ODBC to send them)

Hadley

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



More information about the R-help mailing list