[R] Getting frustrated with RMySQL

Jeffrey Horner jeff.horner at vanderbilt.edu
Tue Oct 14 23:32:25 CEST 2008

Ted Byers wrote on 10/14/2008 02:33 PM:
> Getting the basic stuff to work is trivially simple.  I can connect, and, for
> example, get everything in any given table.
> What I have yet to find is how to deal with parameterized queries or how to
> do a simple insert (but not of a value known at the time the script is
> written - I ultimately want to put my script into a scheduled task, so the
> analysis can be repeated on updated data either daily or weekly).  
> Using "INSERT INTO myTable (a) VALUES (1)" is simple enough, but what if I
> want to insert a sample number (using, e.g. WEEK(sample_date) as a sample
> identifier) along with the rate parameter estimated using fitdistr to fit an
> exponential distribution to a dataset, along with its sd?  If I were using
> Perl or Java, I'd set up the query similar to "INSERT INTO myTable (a,b,c)
> VALUES (?,?,?)", and then use function calls to set each of the query
> parameters.  I am having an aweful time finding the corresponding functions
> in RMySQL.

I've found the best way to parameterize is using R's sprintf function. 
For instance, the following query not only parameterizes the variable 
position, but also the table name:

fields  <- dbGetQuery(con,sprintf("select field,elem_label from %s_meta 
where field='%s'",inp$pnid,inp$field))



> And for the data, the simplest, and most efficient, way to get the data is
> to use a statement like:
> SELECT a,b,c FROM myTable GROUP BY g_id, WEEK(sdate);
> The data is in MySQL, and my analysis needs to be applied independantly to
> each group obtained from a query like this.  It appears I can't use a data
> frame since none of the samples are of the same size (lets say the
> probability of the samples being the same size in indistinguishable from 0). 
> Is it possible to put the resultset from such a query into a list of vectors
> that I can iterate over, passing each vector to fitdistr in turn?  If so,
> how?
> I know I can get this using Perl (by getting each sample individually and
> writing it to a file, then having R read the file, do the analysis and write
> the output to another file, and then have Perl parse the output file to
> insert the parameter estimates I need into the appropriate table), but that
> seems inefficient.
> Is it possible to do all I need with R working directly with MySQL?  If so,
> can someone fill in the apparent gaps left in the RMySQL documentation?
> Thanks.
> Ted


More information about the R-help mailing list