[Rd] Depending/Importing data only packages

Paul Gilbert pgilbert902 at gmail.com
Sat Dec 7 23:08:10 CET 2013

On 13-12-07 01:47 PM, Gabor Grothendieck wrote:
> On Sat, Dec 7, 2013 at 1:35 PM, Paul Gilbert <pgilbert902 at gmail.com> wrote:
>> On 13-12-07 12:19 PM, Gábor Csárdi wrote:
>>> I don't know about this particular case, but in general it makes sense
>>> to rely on a data package. E.g. I am creating a package that does
>>> Bayesian inference for a particular problem, potentially relying on
>>> prior knowledge. I think it makes sense to put the data that is used
>>> to calculate the prior into another package, because it will be larger
>>> than the code, and it does not change that often.
>>> Gabor
>>> On Sat, Dec 7, 2013 at 11:51 AM, Paul Gilbert <pgilbert902 at gmail.com>
>>> wrote:
>>>> Would "Suggests" not work in this situation? I don't understand why you
>>>> would need Depends. In what sense do you rely on the data only package?
>> HW> Because I want someone who downloads the package to be able to run
>> HW> the examples without having to take additional action.
>> HW>
>> HW> Hadley
>> I went through this myself, including thinking it was a nuisance for users
>> to need to attach other packages to run examples. In the end I decided it is
>> not so bad to be explicit about which package the example data comes from,
>> and to show that in the examples. Users may not always want this data, and
>> other packages that build on yours probably do not want it.
>> Even in the Bayesian inference case pointed out by Gábor, I am not
>> convinced. It means the prior knowledge base cannot be exchanged for another
>> one. The package would be more general if it allowed the possibility of
>> attaching a different database of prior information. But this is clearly a
>> more important case, since the code probably does not work without some
>> database. (There are a few other situations where something like
>> "RequireOneOf:" would be useful.)
> Requiring users to load packages which could be loaded automatically
> seems to go against ease of use.  It's just one more thing that they
> have to remember to do.
> It really should be possible to write a "batteries included" package
> while leveraging off of other packages.
Just to be clear, I distinguish the "batteries included" situation from
the "spare batteries included" situation. I think it should be possible
to automatically load everything that is really needed, which is why I
think the Bayesian database is a more important case. But it strikes me
as bad to attach everything that a user could ever possibly want. Taken
to the extreme, one could automatically attach all packages. Some
packages seemed to be headed in that direction before the new rules
started to be enforced.

There is certainly a trade-off here between ease of use (not requiring
the user to attach packages) and namespace conflicts, which cost time
and are difficult to debug. For packages that no one ever uses in other
packages, there would be a tendency to lean toward ease of use. But as
soon as anyone starts building on top of a package with another one, I
think that avoiding potential conflicts will dominate.
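
As a rough sketch of the "be explicit about where the example data comes
from" approach, assuming the data package is listed under Suggests rather
than Depends: the helper below loads one dataset without attaching the
data package, so nothing is added to the search path. The function name
load_example_data and the names "examplePkgData" and "prices" are
hypothetical, used only for illustration.

```r
# Load a single dataset from a suggested data package without attaching it.
# Returns NULL (with a message) if the package is not installed.
load_example_data <- function(name, pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Install '", pkg, "' to run this example.")
    return(NULL)
  }
  env <- new.env()
  # data() copies the dataset into 'env'; the search path is left unchanged.
  utils::data(list = name, package = pkg, envir = env)
  get(name, envir = env)
}

# Hypothetical usage inside an example; skipped quietly if the data
# package is absent.
prices <- load_example_data("prices", "examplePkgData")
if (!is.null(prices)) print(summary(prices))
```

The example costs a few extra lines, but it states which package supplies
the data, and packages that build on this one do not inherit an attached
data package.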

