[R] including data frames in R packages

Gavin Simpson gavin.simpson at ucl.ac.uk
Mon Feb 25 10:17:31 CET 2008


On Sun, 2008-02-24 at 18:05 -0800, dxc13 wrote:
> useR's,
> 
> Does anyone know if there is a size limitation on the data frames that can
> be included in R packages?  I have a data set in a text file that I would
> like to include in a package I am building and it is 8.5 MB in size.  Will
> this be problematic?  Is the process for including data sets in packages
> documented in WRE?
> 
> Thanks,
> dxc

Is the 8.5MB the size of the text file or the size of the saved object?
Saved objects can be compressed using the 'compress' argument to save(),
which could save some space.
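
As a rough sketch of the difference compression can make, the snippet
below saves the same object with and without compression and compares
the file sizes. The data frame 'dat' and the temporary file names are
made up for illustration; in a package the .rda file would normally sit
in the data/ directory.

```r
# Hypothetical data frame standing in for the real data set
dat <- data.frame(id = seq_len(1e5), value = rep(c(1.5, 2.5), 5e4))

# Temporary files used only for this comparison
f_plain <- tempfile(fileext = ".rda")
f_comp  <- tempfile(fileext = ".rda")

save(dat, file = f_plain, compress = FALSE)  # no compression
save(dat, file = f_comp,  compress = TRUE)   # gzip compression

# Compare the on-disk sizes in bytes
sizes <- file.info(c(f_plain, f_comp))$size
names(sizes) <- c("uncompressed", "compressed")
print(sizes)
```

How much is gained depends heavily on the data; repetitive columns
compress far better than noisy numeric ones.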

How much memory does the object occupy once loaded, and how much memory
is required to use it in examples? Not everyone has masses of RAM yet. I
was stung by that with an early version of a package I wrote a while
back: I hadn't considered the memory usage of the examples, and my poor
laptop with 512MB of RAM took 10+ hours to run R CMD check on it because
I quickly got into swap hell. The same check completed in a few minutes
on my main development machine.
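
One way to get a feel for the in-memory footprint is object.size(). A
minimal sketch, again with a made-up data frame 'dat' in place of the
real data; note that this reports the size of the object itself, not
the extra memory your examples need while working on it.

```r
# Hypothetical data frame standing in for the real data set
dat <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

# Approximate memory used by the object once loaded
sz <- object.size(dat)
print(sz, units = "Mb")
```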

8.5MB isn't particularly large these days for most people, but consider
that not all users are on fast ADSL connections. If your package is
likely to be popular and is submitted to CRAN, then there is also the
load and bandwidth on those servers to consider.

Also, 8.5MB of text file suggests quite a lot of data to be using in
examples. If you do use all of it, consider the execution time for your
code. CRAN runs checks on a host of architectures for all the packages
stored there, so if it takes ages to check your package because you are
using a large data set, that would be something to consider.
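
To see roughly how long an example takes before R CMD check tells you,
you could wrap it in system.time(). This is only a sketch: the data
frame 'dat' and the lm() call are placeholders for your own data and
package functions.

```r
# Hypothetical data and model standing in for a package example
dat <- data.frame(x = rnorm(1e4))
dat$y <- 2 * dat$x + rnorm(1e4)

# Time the example; 'elapsed' is wall-clock time in seconds
timing <- system.time(fit <- lm(y ~ x, data = dat))
print(timing[["elapsed"]])
```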

Then there is the issue of seeing the woods for the trees. If the data
set is intended to illustrate the package functions via examples, having
a simple example that is easily comprehended is far better than a more
complicated example. Having said that, 8.5MB may be typical for your
subject area, in which case this may be of less significance.
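
If a smaller example would do, one option is to ship a random subset of
the full data. A sketch under the assumption that rows can be sampled
independently; 'full', 'small' and the subset size of 200 are invented
for illustration.

```r
set.seed(1)  # make the subset reproducible

# Hypothetical full data set
full <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

# Keep a random sample of 200 rows for use in examples
small <- full[sample(nrow(full), 200), ]

print(dim(small))
print(object.size(small), units = "Kb")
```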

My two penneth,

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


