[R] RJDBC to OpenOffice Calc as RODBC to MS Excel

Marc Schwartz marc_schwartz at comcast.net
Fri Dec 14 14:28:23 CET 2007


On Fri, 2007-12-14 at 07:31 +0000, Prof Brian Ripley wrote:
> On Fri, 14 Dec 2007, Metz, Thomas (IRRI) wrote:
> 
> > Under Windows, I have used RODBC to connect to Excel spreadsheets as per
> > the example below:
> >
> > library(RODBC);
> > connect = odbcConnectExcel("testdata.xls");
> > query = "SELECT [data$.ethn], [data$.sex], [data$.age],
> >                [data$.height], [data$.weight],
> >                [label$.label]
> >         FROM [data$], [label$]
> >         WHERE [data$.ethn] = [label$.ethn];"
> > data = sqlQuery(connect, query);
> > odbcClose(connect);
> >
> > [data$] and [label$] are two named sheets in the Excel spreadsheet
> > testdata.xls. [.ethn], [.sex], [.age], [.height], [.weight], and
> > [.label] are cloumn names that appear in the first row in the sheets. I
> > can also have UNION queries that allow me to overcome the spreadsheet
> > row limitation of a single sheet. The idea is to allow normalization of
> > data in a spreadsheet and leveraging the power of SQL, without using a
> > database.
> >
> > Can the same be done under Windows (Linux?) with OpenOffice Calc using
> > RJDBC? Are there ODBC drivers for OpenOffice Calc?
> 
> An awful lot of that is Microsoft warts on SQL, so it will not be 
> portable.  But in a more standard syntax (drop the [] and $) it should be 
> doable over any connection that supports SQL queries.
> 
> The question is whether OO calc has suitable drivers as an ODBC/JDBC 
> server.  Not a question for this list!  (I suspect the answer is no: 
> Microsoft's drivers effectively use the Access engine to work with 
> spreadsheet files and even plain text.  I don't even see drivers for OO 
> base.)

There are none to my knowledge.  Most of OO.org's ODBC/JDBC integration
is one-way. That is, it can connect within the suite and to external
sources, but does not appear to provide connectivity to enable external
applications to acquire data stored within OO.org's apps.

Base is an embedded version of HSQLDB (http://www.hsqldb.org), which is
a java based application. In theory, it would support a JDBC interface,
but I have seen none and when the subject comes up on the OO.org lists,
no solutions are forthcoming. It's too bad, at least within this
context, that the OO.org folks elected to use HSQLDB rather than SQLite,
which was the the other option under consideration. Perhaps Sun's
influence, vis-a-vis Java, won the day here.

Bearing in mind that OO.org's Write and Calc documents are just 'zipped'
XML files, it would be possible to parse the data stored within such
documents. I suspect Max Kuhn has spent much time on this for odfWeave.

There are Perl modules that can provide a level of interaction here. For
example, OpenOffice::Parse::SXC
(http://search.cpan.org/~dclee/OpenOffice-Parse-SXC-0.03/SXC.pm)
provides the means to parse a Calc file directly, without needing the
OO.org API.

One could wrap that module in an R function via a system() call and then
interact with a Calc file directly.

HTH,

Marc Schwartz



More information about the R-help mailing list