[R] How do people use Sweave / R / Databases

Marc Schwartz marc_schwartz at me.com
Fri Jan 29 20:56:24 CET 2010


On Jan 29, 2010, at 1:28 PM, Paul wrote:

> I'm currently using R scripts in Sweave to grab some data via ODBC, process it, and then generate some tables.  I'd like to be able to give someone the files and let them reproduce what I've done.  Is there some way to store the data that is gathered by ODBC so that the second person can recreate the work without the database (apart from just writing it to a file or into the document)?
> 
> Thanks
> 
> Paul.

There are likely to be several approaches, as there always are with R, but here is mine.

I wrote a function that uses RODBC to retrieve the entire contents of a series of Oracle views from our server. It is basically a loop of "select * from VIEWNAME" queries, where the changing parts of the query are paste()d together. The function takes a view-name prefix that defines a common set of views, gets the names of the views matching that prefix from a table that stores all view names, and then creates a series of data frames in the R global environment via assign(), with one data frame per Oracle view.
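
A minimal sketch of such a function might look like the following. The name fetch_views, its arguments, the injectable query_fun, and the use of Oracle's ALL_VIEWS dictionary view for the name lookup are all assumptions made for illustration; the lookup table used on our server is a different one.

```r
# Hypothetical helper; names and the lookup query are illustrative assumptions.
# query_fun defaults to RODBC::sqlQuery, but can be swapped for testing.
fetch_views <- function(channel, prefix, envir = globalenv(),
                        query_fun = RODBC::sqlQuery) {
  # Find the names of all views that share the prefix.
  # Note: "_" in the prefix acts as a single-character wildcard in LIKE.
  lookup <- paste("select VIEW_NAME from ALL_VIEWS where VIEW_NAME like '",
                  prefix, "%'", sep = "")
  view_names <- as.character(query_fun(channel, lookup)$VIEW_NAME)

  # One "select *" per view, each result stored as a data frame
  # named after the view.
  for (view in view_names) {
    query <- paste("select * from", view)
    assign(view, query_fun(channel, query), envir = envir)
  }

  # Return the view names invisibly so they can be passed to save().
  invisible(view_names)
}

# Typical use (requires a configured ODBC data source; "my_dsn" is assumed):
#   library(RODBC)
#   channel <- odbcConnect("my_dsn")
#   views <- fetch_views(channel, "STUDY_")
#   odbcClose(channel)
```

Returning the view names makes it easy to hand the same list to save() afterwards.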

The function is called from within a code chunk in a .Rnw file, which by default is set to "<<results=hide,eval=FALSE>>=".

I do this so that I can run the code chunk once manually to secure the data. Note that I use ESS for all of this, so I just highlight that code and send it to the R buffer (session).

I then save the data frames (Oracle views) to a .RData file using save() to preserve the "clean" source data.
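
As a rough illustration of that step, the sketch below saves two stand-in data frames and restores them again. The object names, the "STUDY_" prefix, and the file name are made up for the example; in practice the data frames would come from the database retrieval.

```r
# Stand-ins for data frames created by the retrieval step (hypothetical names).
STUDY_DEMOG <- data.frame(id = 1:3, age = c(34, 51, 47))
STUDY_LABS  <- data.frame(id = 1:3, hgb = c(13.2, 14.1, 12.8))

# Collect every object whose name starts with the shared prefix
# and write them all to a single .RData file.
view_names <- ls(pattern = "^STUDY_")
rdata_file <- file.path(tempdir(), "study_views.RData")  # assumed file name
save(list = view_names, file = rdata_file)

# Later, or on a colleague's machine: restore the objects
# without any database connection. load() returns the restored names.
rm(list = view_names)
restored <- load(rdata_file)
```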

I can then run the rest of the .Rnw file as much as I want, changing code as I need to, but always using the same set of data, since the code chunk containing the function that gets the Oracle data is not evaluated on subsequent runs. 
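
To make the workflow concrete, here is a small self-contained sketch that writes a two-chunk .Rnw file to a temporary location and weaves it. The chunk labels, the DEMOG data frame, and the file names are assumptions for the example. The first chunk would fail if it were evaluated, which shows that eval=FALSE keeps Sweave from running it.

```r
# Temporary files for the example (names are arbitrary).
rnw   <- tempfile(fileext = ".Rnw")
tex   <- tempfile(fileext = ".tex")
rdata <- tempfile(fileext = ".RData")

# Stand-in for the one-off manual retrieval and save().
DEMOG <- data.frame(id = 1:3, age = c(34, 51, 47))
save(DEMOG, file = rdata)
rm(DEMOG)

# A minimal .Rnw file: the retrieval chunk is not evaluated,
# the analysis chunk loads the saved data instead.
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<getdata, results=hide, eval=FALSE>>=",
  "stop('needs the database; run by hand only')",
  "@",
  "<<analysis>>=",
  paste0("load(", deparse(rdata), ")"),  # deparse() quotes and escapes the path
  "summary(DEMOG$age)",
  "@",
  "\\end{document}"
), rnw)

# Weave the file; this succeeds because the first chunk is skipped.
Sweave(rnw, output = tex, quiet = TRUE)
```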

I also use Subversion to version-control changes to the .Rnw file and to store the original data set. Since I perform updated retrievals over time, this gives me the flexibility to run the .Rnw file (or a revised version) against older versions of the retrieved data as needed.

So, using this approach, you can pass the .RData file off to your colleague, along with the .Rnw file and they should be good to go. 

HTH,

Marc Schwartz


