[Rd] [RFC] A case for freezing CRAN
jari.oksanen at oulu.fi
Fri Mar 21 09:33:32 CET 2014
Freezing CRAN solves no problem of reproducibility. If you know the sessionInfo() or the version of R, the packages used and their versions, you can reproduce that setup. If you do not know, then you cannot. You can try to guess: source code of old release versions of R and old packages is in the CRAN archive, and these files have dates. So you can collect a snapshot of R and packages for a given date. This is not an ideal solution, but it is the same level of reproducibility that you get with a strictly frozen CRAN. CRAN is not the sole source of packages, and even with a strictly frozen CRAN the users may have used packages from other sources. I am sure that if CRAN were frozen (but I assume it happens the same day hell freezes), people would increasingly often use package sources other than CRAN. The choice is easy if the alternatives are to wait until next year for the bug fix release, or to do the analysis now and use package versions in R-Forge or github. Then you could not assume that frozen CRAN packages were used.
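The date-based snapshot described above can be sketched roughly as follows. This is only an illustration in Python, not a tool that exists: the archive listing is hard-coded, the package names, versions and dates are invented, and `snapshot` is a hypothetical helper name. A real version would have to read the file dates from the CRAN archive itself.

```python
from datetime import date

# Hypothetical archive listing: package -> list of (version, release date).
# All names, versions and dates here are invented for illustration only.
ARCHIVE = {
    "pkgA": [
        ("1.0-0", date(2012, 3, 1)),
        ("1.1-0", date(2013, 6, 15)),
        ("2.0-0", date(2014, 2, 10)),
    ],
    "pkgB": [
        ("0.9", date(2013, 11, 30)),
        ("0.5", date(2011, 1, 20)),
    ],
}


def snapshot(archive, cutoff):
    """For each package, pick the newest version released on or before cutoff."""
    chosen = {}
    for name, releases in archive.items():
        # Keep only releases that already existed at the cutoff date.
        eligible = [rel for rel in releases if rel[1] <= cutoff]
        if eligible:
            # The newest eligible release is the one with the latest date.
            chosen[name] = max(eligible, key=lambda rel: rel[1])[0]
    return chosen


if __name__ == "__main__":
    print(snapshot(ARCHIVE, date(2013, 12, 31)))
```

Packages with no release before the cutoff are simply left out of the result, and packages that were installed from other sources than CRAN cannot be recovered this way at all.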
CRAN policy is not made in this mailing list, and CRAN maintainers are so silent that it hurts the ears. However, I hope they won't freeze CRAN.
Strict reproduction seems to be harder than I first imagined: ./configure && make really failed for R 2.14.1 and older on my office desktop. To reproduce older analyses, I would also need to install older tool sets (I suspect gfortran and cairo libraries).
CRAN is one source of R packages, and certainly its policy does not suit all developers. There is no policy that suits all. Frozen CRAN would suit some, but certainly would deter some others.
There seems to be a common sentiment here that the only reason anybody would use R older than 3.0.3 is to reproduce old results. My experience from the Real Life(™) is that many of us use computers that we do not own: they are the property of our employer. This may mean that we are not allowed to install any software there, or that we have to pay, or the Department or project has to pay, the computer administration for installing new versions of software (our case). This is often called security. Personally I avoid this by using a Mac laptop and a Linux desktop: these are not supported by the University computer administration and I can do what I please with them, but poor Windows users are stuck. Computer classes are also maintained by the centralized computer administration. This January they had a new R, but last year it was still two years old. However, users can install packages in their personal "folders" so that they can use current packages even with an older R. Therefore I want to take care that the packages I maintain also run on older R. Therefore I also applaud the current CRAN policy where new versions of packages are "backported" to the previous R release: even if you are stuck with a stale R, you need not be stuck with stale packages. Currently I cannot test with R older than 2.14.2, though, but I do test with that version regularly and certainly before CRAN releases. If somebody wants to prevent this, they can set their package to unnecessarily depend on the current version of R. I would regard this as antisocial, but nobody would ask what I think about this so it does not matter.
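To make the point about version dependencies concrete, here is a minimal sketch in Python of how a requirement such as "R (>= 3.0.3)" in the Depends field of a package DESCRIPTION file shuts out an older R. The helper names `parse_version`, `min_r_version` and `runs_on` are hypothetical, only the ">=" form is handled, and the package names are invented. R's own version comparison is more complete than this.

```python
import re


def parse_version(text):
    """Turn a version string such as '2.14.2' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", text))


def min_r_version(depends):
    """Return the minimum R version named in a Depends field, or None.

    Only the 'R (>= x.y.z)' form is recognised (a simplifying assumption).
    """
    match = re.search(
        r"(?:^|,)\s*R\s*\(\s*>=\s*([0-9]+(?:[.-][0-9]+)*)\s*\)", depends
    )
    if match is None:
        return None
    return parse_version(match.group(1))


def runs_on(depends, r_version):
    """True if the given R version satisfies the package's R requirement."""
    required = min_r_version(depends)
    return required is None or parse_version(r_version) >= required


if __name__ == "__main__":
    # Hypothetical Depends fields; pkgA and pkgB are invented names.
    print(runs_on("R (>= 2.14.0), pkgA, pkgB", "2.14.2"))
    print(runs_on("R (>= 3.0.3), pkgA", "2.14.2"))
```

A package that declares only the R version it really needs stays installable on the stale R of a computer class, while one that names the current release by habit does not.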
The development branch of my package is in R-Forge, and only bug fixes and (hopefully) non-breaking enhancements (isolated so that they do not influence other functions, safe so that neither the API nor the format of the output changes) are merged to the CRAN release branch. This policy was adopted because it fits the current CRAN policy, and it would probably need to change if CRAN policy changes.
Cheers, Jari Oksanen