[Rd] On implementing zero-overhead code reuse

elijah wright elw at stderr.org
Tue Oct 4 18:10:35 CEST 2016


Shower thoughts:

Are you digging for something like what you'd use with a CI/CD pipeline?
E.g., building a workflow that pulls a tag from a couple of code
repositories, checks them out into a workspace, installs prerequisites, and
then runs your code/tasks in a repeatable fashion?

I'm not aware of anything like a Gemfile or a Berksfile or a
package.json for R - but you can surely approximate one with a job step
that runs install.packages() from a snippet of R code.
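As a rough sketch of what I mean (assuming a CI job that can run Rscript,
and a hypothetical deps.R checked into the repo -- the package names below
are illustrative, not anything specific):

```r
# deps.R -- a hypothetical dependency manifest, run as a CI job step via:
#   Rscript deps.R
# The package names are only examples.
pkgs <- c("dplyr", "ggplot2", "data.table")

# Install only what is missing, so repeated runs of the step are cheap.
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) {
  install.packages(missing, repos = "https://cloud.r-project.org")
}
```

Of course this pins package *names* but not *versions*, which is where the
archive question below comes in.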

[I did have a quick glance at the install.packages() docs to refresh my
memory -- it looks like it's biased toward installing the latest version
*unless* you point it at something like an archive that has your package
selections frozen in time.  You can either store the deps yourself, or find
an archive that keeps historical snapshots by date.  I would expect,
really, that the CRAN packages are unlikely to suddenly stop being version
controlled, or for their history to vanish into the ether....   Maybe
someone stores zfs snapshots or similar of CRAN, on a date-by-date basis?
It should be cheap (disk-wise) to do...]
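For what it's worth, date-based CRAN snapshots of roughly that kind do
exist -- e.g. MRAN serves a frozen copy of CRAN per date.  A sketch of
pointing install.packages() at one (the snapshot date and package name are
just examples, not a recommendation):

```r
# Sketch: install against a dated CRAN snapshot instead of "latest".
# Substitute the date your analysis was frozen at.
snapshot <- "https://mran.microsoft.com/snapshot/2016-10-03"
install.packages("dplyr", repos = snapshot)

# Or set it session-wide, so every install resolves against that date:
options(repos = c(CRAN = snapshot))
```

That gets you version-by-date rather than explicit per-package version
pins, but for "rerun the analysis as of commit X" that may be enough.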

In my ideal world, 'newer packages should mean more accurate results' --
running code with older package versions means you're also reproducing
their bugs, which to me seems not-useful in most cases....

best,

--e


On Mon, Oct 3, 2016 at 6:24 PM, Kynn Jones <kynnjo at gmail.com> wrote:

> Martin, thanks for that example.  It's definitely eye-opening, and
> very good to know.
>
> The installation business, however, is still a killer for me.  Of
> course, it's a trivial step in a simple example like the one you
> showed.  But consider this scenario:  suppose I perform an analysis
> that I may publish in the future, so I commit the project's state at
> the time of the analysis, and tag the commit with the KEEPER tag.
> Several months later, I want to repeat that exact analysis for
> whatever reason.  If the code for the analysis were in Python (say),
> all I need to do is this (at the Unix command line):
>
>     % git checkout KEEPER
>     % python src/python/go_to_town.py
>
> ...knowing that the `git checkout KEEPER` command, *all by itself*,
> has put the working directory in the state I want it to be before I
> re-do the analysis.
>
> AFAICT, if the code for the analysis were in R, then `git checkout`, by
> itself, would *not* put the working directory in the desired state.  I
> would still need to re-install all the R libraries in the repo.  And I'd
> better not forget to do this re-installation, otherwise I will end up
> running code different from the code I thought I was running.  (I find
> this prospect horrifying, for some reason.)
>
> A similar need to re-install stuff would arise whenever I update the repo.
>
> Please correct me if I'm wrong.
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

