[Rd] enabling reproducible research & R package management & install.package.version & BiocLite

Cook, Malcolm MEC at stowers.org
Mon Mar 4 22:13:18 CET 2013


In support of reproducible research at my Institute, I seek an approach to re-creating the R environments in which an analysis has been conducted.

By this I mean the exact version of R and the exact versions of all packages used in a particular R session.

I am seeking comments/criticism of this as a goal, and of the following outline of an approach:

=== When all the steps of a workflow have been finalized ===
* re-run the workflow from beginning to end
* save the results of sessionInfo() into an RDS file named after the current date and time.
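
A minimal sketch of the snapshot step, assuming a hypothetical helper name (save_session_info) and a file-name pattern of my own choosing; only sessionInfo() and saveRDS() come from base R:

```r
# Sketch: snapshot the session once the workflow has run from beginning to end.
# The helper name and the "sessionInfo_<timestamp>.rds" pattern are
# illustrative assumptions, not an established convention.
save_session_info <- function(dir = ".") {
  stamp <- format(Sys.time(), "%Y-%m-%d_%H%M%S")
  path <- file.path(dir, paste0("sessionInfo_", stamp, ".rds"))
  saveRDS(sessionInfo(), path)
  invisible(path)
}

# Written to a temporary directory here; a real workflow would more likely
# keep the file next to the analysis results.
snapshot_file <- save_session_info(tempdir())
```

The saved object records the R version, the attached packages (otherPkgs) and the packages loaded only through a namespace (loadedOnly), each with its version.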

=== Later, when desirous of exactly recreating this analysis ===
* read the (old) sessionInfo() into an R session
* exit with failure if the running version of R doesn't match
* compare the old sessionInfo to the currently installed packages (e.g. using packageVersion)
* where there are discrepancies, install the required version of the package (without dependencies) into a new library (named after the old sessionInfo RDS file)
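
The comparison steps above could be sketched roughly as follows. The function name find_discrepancies is hypothetical, versions are compared as plain strings, and the current session stands in for an old snapshot so that the example runs on its own:

```r
# Hypothetical helper: report packages recorded in an old sessionInfo() that
# are missing, or installed at a different version, in the current libraries.
find_discrepancies <- function(old, lib.loc = NULL) {
  # Exit with failure if the running version of R does not match.
  if (!identical(old$R.version$version.string, R.version$version.string)) {
    stop("R version mismatch: the analysis used ",
         old$R.version$version.string)
  }
  # Attached packages plus those loaded only via a namespace.
  pkgs <- c(old$otherPkgs, old$loadedOnly)
  pkg_names <- vapply(pkgs, function(p) p$Package, character(1),
                      USE.NAMES = FALSE)
  wanted <- vapply(pkgs, function(p) p$Version, character(1),
                   USE.NAMES = FALSE)
  # NA marks a package that is not installed at all.
  have <- vapply(pkg_names, function(p) {
    tryCatch(as.character(packageVersion(p, lib.loc = lib.loc)),
             error = function(e) NA_character_)
  }, character(1), USE.NAMES = FALSE)
  mismatch <- is.na(have) | have != wanted
  data.frame(package = pkg_names, wanted = wanted, installed = have,
             stringsAsFactors = FALSE)[mismatch, , drop = FALSE]
}

# In practice `old` would come from readRDS() on the saved snapshot.
old <- sessionInfo()
discrepancies <- find_discrepancies(old)
```

Each row of the result names a package that would need to be installed into the new library.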

Then the analyst should be able to put the new library at the front of .libPaths and run the analysis, confident that the same versions of the packages are in use.
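
A sketch of that last step, assuming a hypothetical snapshot file name and a library directory created under tempdir() purely for illustration:

```r
# Sketch: a private library named after the snapshot file.
# The file name below is a made-up example.
snapshot_file <- "sessionInfo_2013-03-04_221318.rds"
lib_dir <- file.path(tempdir(), sub("\\.rds$", "", basename(snapshot_file)))
dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)

# Prepend the new library so that packages installed there are found before
# other versions of the same packages in the existing libraries.
.libPaths(c(lib_dir, .libPaths()))
```

Note that the library of the R installation itself (.Library) always remains on the search path, so base and recommended packages there stay visible.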

In the past I have successfully used install-package-version.R to revert to previous versions of R packages (https://gist.github.com/1503736). There is a similar tool in Hadley Wickham's devtools.
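
For illustration only, and not a description of what either tool does internally, installing a superseded CRAN version into a given library might look like the sketch below. The helper names are hypothetical; the URL layout is that of the CRAN source archive, which holds superseded versions only (the current version of a package lives in src/contrib/ instead).

```r
# Hypothetical helper: URL of a superseded source tarball in the CRAN archive.
cran_archive_url <- function(pkg, version,
                             repo = "https://cran.r-project.org") {
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz", repo, pkg, pkg, version)
}

# Hypothetical helper: install that tarball into `lib`. With repos = NULL,
# install.packages() installs only the given file and does not resolve
# dependencies, which matches the "without dependencies" step.
install_old_version <- function(pkg, version, lib) {
  install.packages(cran_archive_url(pkg, version), repos = NULL,
                   type = "source", lib = lib)
}

# Example (not run here, since it needs network access and build tools):
# install_old_version("plyr", "1.7", lib = .libPaths()[1])
```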

But I don't know whether I need something special for Bioconductor packages that have been installed using biocLite, and I seek advice here.

I do understand that recreating the R environment is not sufficient to guarantee reproducibility. Some of my colleagues have suggested saving a virtual machine with all the software, libraries, and data installed. So I am also interested, in general, in what other people are doing to this end. But I am most interested in:

* is this a good idea?
* is there a worked-out solution?
* does biocLite introduce special cases?
* where do the dragons lurk?

... and the like

Any tips?


~ Malcolm Cook
Stowers Institute / Computation Biology / Shilatifard Lab
