[R] Reading and writing to S-like databases

Agustin Lobo alobo at ija.csic.es
Fri Sep 28 10:16:45 CEST 2001

Jason's suggestion ("using a relational database management
system") is probably the best option, although a complicated one.
Some alternatives:

1. Save each object as a separate binary file. Create
another object (e.g. a two-column matrix) that maps each object
to its file. Attach the index object and, in your function,
attach only the object you need. Unfortunately, this means
that you must know which part (file) of the whole database you
need for a given operation (e.g., if you need
individual 3456, you must know which file holds its data).
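
A minimal sketch of alternative 1, under stated assumptions: the
directory, the object names (ind_0001, ...) and the helper fetch()
are hypothetical, and saveRDS()/readRDS() from current versions of R
are used instead of attach(), since they store exactly one object
per file.

```r
# Hypothetical database: three objects, each saved to its own binary file.
db_dir <- file.path(tempdir(), "objdb")
dir.create(db_dir, showWarnings = FALSE)

objects <- list(ind_0001 = rnorm(10),
                ind_3456 = rnorm(10),
                ind_9999 = rnorm(10))

# Index: a two-column character matrix mapping each object name to its file.
index <- cbind(name = names(objects),
               file = file.path(db_dir, paste0(names(objects), ".rds")))

for (i in seq_len(nrow(index))) {
  saveRDS(objects[[index[i, "name"]]], index[i, "file"])
}

# Look up the file for one object and read only that file.
fetch <- function(name, index) {
  hit <- index[index[, "name"] == name, "file"]
  if (length(hit) != 1L) stop("no unique entry for '", name, "' in the index")
  readRDS(hit)
}

x <- fetch("ind_3456", index)
```

Only the small index has to stay in memory; the cost is that the
index must be kept in step with the files whenever an object is
added or removed.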

2. You can use delay(). I'm almost done with a short document
on "Using R with large objects", for which I've received useful
input from the list. In particular, I got the following message
from Ray Brownrigg (Ray.Brownrigg at mcs.vuw.ac.nz):

A.  To set up an object so that it is available at all times, but only
loaded into memory when first referenced, consider the following:

test.x <- delay({attach(system.file("data", "test.rda", pkg="test"));
                 test.x})  # closing "test.x})" is assumed; the line was cut off

The object test.x has been created and saved as a .rda file using
save(test.x, file="test.rda"), and the resulting file test.rda has been
stored in the data directory of the (installed) package test.  Normally
the command above will be executed as part of loading the package test,
i.e. when library(test) is entered by the user at the R prompt.  Further,
because the object test.x is part of package test, it is not saved as
part of a new .RData when an R session is terminated (as long as
nothing new is assigned to test.x during the session).

You could trick R with this delayed attachment for all the
objects of your database: each one would actually be attached
only if your processing really uses it.
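
The same idea can be sketched for current versions of R, where
delay() is no longer available and delayedAssign() creates the same
kind of promise. The file name, the object name big and the helper
read_one() are hypothetical, chosen only for illustration.

```r
# Hypothetical setup: a large object saved to its own file, then removed.
rda <- file.path(tempdir(), "big.rda")
big <- seq_len(1e5)
save(big, file = rda)
rm(big)

# Helper (an assumption of this sketch): load a file into a scratch
# environment and return one named object from it.
read_one <- function(file, name) {
  e <- new.env()
  load(file, envir = e)
  get(name, envir = e)
}

# Create a promise called 'big'; the file is not read at this point.
delayedAssign("big", read_one(rda, "big"))

# The first use of 'big' forces the promise and reads the file.
n <- length(big)
```

Setting up one such promise per object in the database costs almost
nothing, and each file is read only if the analysis touches the
corresponding object.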

3. As I point out in the document "Getting your stuff organized in R",
I've not found any way to list the objects within a binary R file,
nor to select particular objects from the binary file and attach
only the selected ones (which would be the best solution
in so many cases). I wonder if future R versions could consider
this feature.
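
A partial workaround in current versions of R, sketched under
assumptions (the file and object names are hypothetical): load the
file into a scratch environment, list what it holds, and keep only
the objects you want. Note that load() still reads the whole file,
so this does not give the selective read wished for above.

```r
# Hypothetical binary file holding several objects.
rda <- file.path(tempdir(), "several.rda")
a  <- 1:3
b  <- letters
cc <- matrix(0, 2, 2)
save(a, b, cc, file = rda)

# Load into a scratch environment instead of the workspace.
e <- new.env()
loaded <- load(rda, envir = e)  # load() returns the names it restored

# List the contents of the file.
contents <- sort(loaded)

# Keep only the selected objects; the rest go away with 'e'.
wanted <- "b"
picked <- mget(wanted, envir = e)
rm(e)
```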

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es

On Thu, 27 Sep 2001, David Brahm wrote:

> Hi,
>    I asked this question 2 years ago, and would like to know if the answer has
> changed.
>    In S-Plus, I build databases of many large objects.  In any given analysis,
> I only need a few of those objects, but attach'ing the whole database is fine
> since objects are only read as needed.  How can I do the same thing in R,
> without reading the entire database?
>    One possibility is to treat the database as a package, devoid of code but
> containing many .RData files under /data, then load() each object I'll need.
> Perhaps autoload() can be used to avoid having to anticipate which objects I'll
> need?
>    Another is to use dput and dget.  Again I need to know ahead of time which
> objects I'll want.
>    On July 20, 1999, Ross Ihaka [mailto:ihaka at stat.auckland.ac.nz] wrote:
> > We are building the infrastructure for adding external databases which can be
> > attached in the S fashion. One of the class of external data base will be
> > that of S .Data directories.
> I'm not sure where that ended up -- could you clarify, Ross?  Thanks!
> 			-- David Brahm (a215020 at agate.fmr.com)
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
