[Rd] Problem with distributing data in package.

Hervé Pagès hpages at fhcrc.org
Mon Jul 22 21:04:04 CEST 2013

Hi there,

On 07/20/2013 10:25 AM, Prof Brian Ripley wrote:
> On 20/07/2013 12:01, Simon Knapp wrote:
>> Hi Barry,
>> Thanks for the response, your suggestion was going to be my 'work
>> around'... perhaps I took the second paragraph of section 1.1.5 of
>> R-exts.pdf the wrong way.
>> I'd be interested in knowing why there is a difference between the
>> data in a source package (.rda files) and windows binary package (.R
>> files) if anyone can tell me.
> We have little idea what you did.  Apparently you feel you are exempted
> from the request in the posting guide for a reproducible example.
> At a wild guess, you did not understand the point of R CMD build
> --no-resave-data and did not want your data resaved as a .rda file (the
> documented default).  But there is no mention of running 'R CMD build'
> here.

Having the data resaved as a .rda file by default seems to defeat the
purpose of supporting executable code in the data folder. Maybe there
are good reasons for doing this on CRAN (although it's not clear what
those reasons could be, since the code that generates the data is
generally much smaller than the data it generates), but it would
probably make more sense for 'R CMD build' not to do this by default,
and to leave the package author free to generate the data dynamically.
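
To make the setup under discussion concrete, here is a rough sketch only.
The package name 'mypkg', the dataset name 'sampledata' and its contents
are hypothetical and are not taken from Simon's package. Going by Brian's
remark quoted above, building with --no-resave-data should leave the
script in place rather than replacing it with a .rda file, but I have
not checked this against his case:

```shell
# Hypothetical package skeleton; all names here are assumptions.
mkdir -p mypkg/data

# A data script: data(sampledata) would source() this file,
# so the dataset is generated at load time instead of being shipped.
cat > mypkg/data/sampledata.R <<'EOF'
# Small made-up dataset, generated in code.
sampledata <- data.frame(t = 1:10, value = (1:10)^2)
EOF

# Build without re-saving the data files. Guarded so the snippet is
# harmless when R is not installed or the skeleton has no DESCRIPTION.
if command -v R >/dev/null 2>&1 && [ -f mypkg/DESCRIPTION ]; then
    R CMD build --no-resave-data mypkg
fi
```

With a complete package (DESCRIPTION, NAMESPACE, etc.) installed from
that tarball, data(sampledata) would then run the script each time.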


>> On Sat, Jul 20, 2013 at 4:58 PM, Barry Rowlingson
>> <b.rowlingson at lancaster.ac.uk> wrote:
>>> On Fri, Jul 19, 2013 at 10:33 AM, Simon Knapp
>>> <sleepingwell at gmail.com> wrote:
>>>> Hi List,
>>>> I am building a package for a client to help them create and perform
>>>> analyses against netcdf files which contain 'a temporal stack' of
>>>> grids.
>>>> For my examples and test cases, I create an example dataset in code
>>>> (as this is a lot more space efficient than providing raw data). The
>>>> code creates a netcdf file in tempdir() and an object of class 'ncdf'
>>>> in the global namespace. I have placed the code in a .R file in the
>>>> data directory of my package and 'load' it with a call to data().
>>>   Why not just put the function that generates the data file into the
>>> usual place (/R/ folder) and document it so that the user knows to run
>>> 'sampledata=makeSampleNCDF()' before doing things that need it?
>>>   Trying to put executable code into the data folder does seem a bit
>>> perverse!
> Actually, it seems to be *interpretable* R code in a .R file.
> The help file for data() says:
> Details:
>       Currently, four formats of data files are supported:
>         1. files ending ‘.R’ or ‘.r’ are ‘source()’d in, with the R
>            working directory changed temporarily to the directory
>            containing the respective file.  (‘data’ ensures that the
>            ‘utils’ package is attached, in case it had been run _via_
>            ‘utils::data’.)
> ...
> and (in so far as we can tell without the requested example) that was
> the 'format' intended to be used.
>>> Barry

Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
