[R] Serverless databases in R

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Apr 19 13:46:20 CEST 2010


On Mon, 19 Apr 2010, Frank E Harrell Jr wrote:

> Barry Rowlingson wrote:
>> On Sun, Apr 18, 2010 at 11:30 PM, kMan <kchamberln at gmail.com> wrote:
>>> It was my understanding that .Rdata files were not very portable, and do
>>> not natively handle queries. Otherwise we'd all just use .RData files
>>> instead of farming the work out to SQL drivers & external libraries, and
>>> colleagues who use, e.g. SAS or SPSS would also have no trouble with them.
>>
>>  The "platform" in "cross-platform" to me generally means the
>> operating system on which a program is running - and .Rdata files are
>> perfectly portable between R on Linux, MacOSX, Windows, Solaris etc
>> versions. You didn't mention portability to other statistical
>> packages. You also didn't mention needing SQL, or what you wanted to
>> do with your databases. I figured I'd just mention .Rdata files for
>> completeness!
>>
>>  There's also RJDBC and RODBC which can interface to anything with a
>> JDBC or ODBC interface on your system.
>>
>>  A .RData file could be considered as a serverless NoSQL database.
>> There's a GSOC proposal to investigate interfaces to NoSQL databases
>> and some info here:
>> 
>> http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:nosql_interface
>>
>>  Isn't it odd that the open-source R community has developed functions
>> for reading in proprietary SAS and SPSS format files, but (AFAIK) the
>> commercial sector doesn't seem to support reading data from
>> open-sourced and open-specced R .Rdata files?
>> 
>> Barry
>
> Hi Barry,
>
> Stat Transfer can read and write R binary data frames (.rda files).

Yes, but that is a considerable restriction (and other programs can do 
similar things).  I suspect it means 'data frames with columns from a 
prespecified small set of types' saved in an RDA2 gzipped binary xdr 
format.

BTW, .rda and .RData are simply convenient file extensions: the first 
is more convenient in the Windows world.  A file with either extension 
can be in any one of a collection of many different formats, 
identified by the file 'magic' headers.
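
To illustrate the 'magic' header idea, here is a minimal sketch of how 
a non-R program might sniff a save file before trying to parse it.  It 
assumes the version-2 magic strings listed in 'R Internals' ("RDA2\n" 
for ascii, "RDB2\n" for native binary, "RDX2\n" for xdr) and handles 
only uncompressed or gzip-compressed input.  The function name 
sniff_save_format is hypothetical, and older or other-compression 
formats are simply reported as unrecognised.

```python
import gzip
import io

# Version-2 save-file magic strings, as described in 'R Internals'.
# Assumption: anything else (older formats, bzip2/xz compression) is
# out of scope for this sketch and is reported as unrecognised.
SAVE_MAGIC = {
    b"RDA2\n": "ascii",
    b"RDB2\n": "native binary",
    b"RDX2\n": "xdr",
}

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def sniff_save_format(raw):
    """Return (format_name_or_None, was_gzipped) for the given bytes.

    Hypothetical helper: it only inspects the first five bytes of the
    (possibly gzip-compressed) content and does not parse any objects.
    """
    gzipped = raw[:2] == GZIP_MAGIC
    if gzipped:
        # Decompress just enough of the stream to see the header.
        with gzip.GzipFile(fileobj=io.BytesIO(raw)) as fh:
            head = fh.read(5)
    else:
        head = raw[:5]
    return SAVE_MAGIC.get(head), gzipped
```

For a real file one would pass in the bytes read from disk, e.g. 
sniff_save_format(open(path, "rb").read()), where path is whatever 
.rda or .RData file is at hand.  Recognising the header says nothing 
about whether the objects inside can be interpreted outside R.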

I am not so sure about 'open-specced R .Rdata files'.  In so far as 
there is a spec, I wrote it in 'R Internals' and it is not a full 
spec, mainly because many of the details are only relevant to R 
itself, such as how you read environments and some of the details of 
the object headers.

Had the RDA formats been written with the intent that they would be 
used other than to read all the objects they contain into R, they 
would have been structured differently with a lot more metadata. 
That has been noted for RDA3, but introducing such a format would be a 
major step and is not imminent.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


