[Rd] [RFC] A case for freezing CRAN
dtenenba at fhcrc.org
Wed Mar 19 23:11:53 CET 2014
----- Original Message -----
> From: "Joshua Ulrich" <josh.m.ulrich at gmail.com>
> To: "Jeroen Ooms" <jeroen.ooms at stat.ucla.edu>
> Cc: "r-devel" <r-devel at r-project.org>
> Sent: Wednesday, March 19, 2014 2:59:53 PM
> Subject: Re: [Rd] [RFC] A case for freezing CRAN
> On Wed, Mar 19, 2014 at 4:28 PM, Jeroen Ooms
> <jeroen.ooms at stat.ucla.edu> wrote:
> > On Wed, Mar 19, 2014 at 11:50 AM, Joshua Ulrich
> > <josh.m.ulrich at gmail.com>
> > wrote:
> >> The suggested solution is not described in the referenced article.
> >> It
> >> was not suggested that it be the operating system's responsibility
> >> to
> >> distribute snapshots, nor was it suggested to create binary
> >> repositories for specific operating systems, nor was it suggested
> >> to
> >> freeze only a subset of CRAN packages.
> > IMO this is an implementation detail. If we could all agree on a
> > particular
> > set of cran packages to be used with a certain release of R, then
> > it doesn't
> > matter how the 'snapshotting' gets implemented. It could be a
> > separate
> > repository, or a directory on cran with symbolic links, or a page
> > somewhere
> > with hyperlinks to the respective source packages. Or you can put
> > all
> > packages in a big zip file, or include it in your OS distribution.
> > You can
> > even distribute your entire repo on cdroms (debian style!) or do
> > all of the
> > above.
> > The hard problem is not implementation. The hard part is that for
> > reproducibility to work, we need community wide conventions on
> > which
> > versions of cran packages are used by a particular release of R.
> > Local
> > downstream solutions are impractical, because this results in
> > scripts/packages that only work within your niche using this
> > particular
> > snapshot. I expect that requiring every script be executed in the
> > context of
> > dependencies from some particular third party repository will make
> > reproducibility even less common. Therefore I am trying to make a
> > case for a
> > solution that would naturally improve reliability/reproducibility
> > of R code
> > without any effort by the end-user.
> So implementation isn't a problem. The problem is that you need a way
> to force people not to be able to use different package versions than
> what existed at the time of each R release. I said this in my
> previous email, but you removed and did not address it: "However, you
> would need to find a way to actively _prevent_ people from installing
> newer versions of packages with the stable R releases." Frankly, I
> would stop using CRAN if this policy were adopted.
I don't see how the proposal forces anyone to do anything. If you have an old version of R and you still want to install newer versions of packages, you can download them from their CRAN landing pages. As I understand it, the proposal only addresses which packages would be installed **by default** for a given version of R.
People would be free to override those defaults (by downloading newer packages as described above), but they should then not expect to be able to reproduce an earlier analysis, since they would no longer have the package versions it was run with. If they don't care, that's fine (provided that no other problems arise, such as the newer package depending on a feature of R that doesn't exist in the version they're running).
> I suggest you go build this yourself. You have all the code
> on CRAN, and the dates at which each package was published. If
> people who care about reproducible research find what you've built useful,
> you will create the very community you want. And you won't have to
> force one single person to change their workflow.
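The idea in the paragraph above, building a snapshot from the dates at which each package was published, can be sketched roughly as follows. This is only an illustration in Python, not anything CRAN provides: the package names, versions, dates, and the `snapshot` helper are all hypothetical, and a real implementation would have to read the publication history from the CRAN archive itself.

```python
from datetime import date

# Hypothetical publication history: package -> list of (version, publication date).
# Names, versions, and dates are invented for illustration only.
HISTORY = {
    "pkgA": [("1.0", date(2013, 5, 1)),
             ("1.1", date(2013, 11, 20)),
             ("2.0", date(2014, 3, 1))],
    "pkgB": [("0.9", date(2014, 1, 15))],
    "pkgC": [("0.1", date(2014, 4, 2))],
}


def snapshot(history, cutoff):
    """Map each package to its newest version published on or before cutoff.

    Packages with no release by the cutoff date are left out of the snapshot.
    """
    result = {}
    for name, releases in history.items():
        # Keep only releases that already existed at the cutoff date.
        eligible = [(published, version)
                    for version, published in releases
                    if published <= cutoff]
        if eligible:
            # max() orders by publication date first, so this is the latest one.
            result[name] = max(eligible)[1]
    return result


# A hypothetical cutoff standing in for "the date of an R release".
frozen = snapshot(HISTORY, date(2013, 12, 31))
print(frozen)
```

Anyone who wants newer versions simply ignores the snapshot and installs the current ones; the snapshot only fixes what a given cutoff date maps to.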
> Joshua Ulrich | about.me/joshuaulrich
> FOSS Trading | www.fosstrading.com