[Rd] Docker versus Vagrant for reproducibility - was: The case for freezing CRAN

Rainer M Krug Rainer at krugs.de
Fri Mar 21 16:12:16 CET 2014


Gábor Csárdi <csardi.gabor at gmail.com> writes:

> You might want to look at packer as well, which can build virtual machines
> from an ISO, without any user interaction. I successfully used it to build
> VMs with Linux, OS X and Windows. It can also create vagrant boxes. You can
> specify provisioners, e.g. to install R, or a set of R packages, etc. It is
> under heavy development, by the same team as vagrant.

I think I am getting lost in these - I looked at Docker, and it looks
promising, but I did not even manage to open a shell in the running
container. Is there a howto somewhere on how one can use these with R,
for the purpose discussed in this thread? If not, I really think one is
needed. It is extremely difficult for me to translate what I want to
do into the deployment / management / development scenarios discussed in
the blogs I have found.
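
For what it is worth, this is roughly as far as my own understanding of
the Docker side goes. It is only a sketch, not a tested recipe: the base
image (debian:stable), the package (vegan), the script name (analysis.R),
the image tag (r-frozen) and the helper names are all assumptions of mine,
chosen for illustration. It is written in Python simply to spell out the
pieces; it only generates the Dockerfile text and the commands one would
run by hand.

```python
def render_dockerfile(base_image, r_packages, script, cran_repo):
    """Return Dockerfile text for an image that installs R and runs one script.

    All arguments are illustrative assumptions, not a recommended setup.
    """
    lines = [
        f"FROM {base_image}",
        # Install R from the distribution's own package repository.
        "RUN apt-get update && apt-get install -y r-base",
    ]
    if r_packages:
        pkgs = ", ".join(f"'{p}'" for p in r_packages)
        # Packages come from whatever the mirror serves at build time;
        # it is the built image, not this file, that freezes the versions.
        lines.append(
            f"RUN Rscript -e \"install.packages(c({pkgs}), repos='{cran_repo}')\""
        )
    lines += [
        "WORKDIR /work",
        f"COPY {script} /work/{script}",
        f'CMD ["Rscript", "{script}"]',
    ]
    return "\n".join(lines) + "\n"


def docker_commands(tag):
    """Return the shell commands to build the image, run it, and open a shell."""
    return [
        f"docker build -t {tag} .",        # build from ./Dockerfile
        f"docker run --rm {tag}",          # run the analysis script
        f"docker run --rm -it {tag} bash", # interactive shell in a container
    ]


dockerfile = render_dockerfile(
    "debian:stable", ["vegan"], "analysis.R", "https://cloud.r-project.org"
)
print(dockerfile)
for command in docker_commands("r-frozen"):
    print(command)
```

If I read the documentation correctly, it is the built image that would
have to be archived to rerun the analysis later with the same versions,
since rebuilding from the Dockerfile fetches whatever is current at
that time.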

Cheers, 

(a confused)
Rainer


>
> Gabor
>
> On Fri, Mar 21, 2014 at 9:03 AM, Philippe GROSJEAN <
> Philippe.GROSJEAN at umons.ac.be> wrote:
>
>>
>> ..............................................<}))><........
>>  ) ) ) ) )
>> ( ( ( ( (    Prof. Philippe Grosjean
>>  ) ) ) ) )
>> ( ( ( ( (    Numerical Ecology of Aquatic Systems
>>  ) ) ) ) )   Mons University, Belgium
>> ( ( ( ( (
>> ..............................................................
>>
>> On 21 Mar 2014, at 10:59, Rainer M Krug <Rainer at krugs.de> wrote:
>>
>> > Dirk Eddelbuettel <edd at debian.org> writes:
>> >
>> >> o Roger correctly notes that R scripts and packages are just one issue.
>> >>   Compilers, libraries and the OS matter.  To me, the natural approach these
>> >>   days would be to think of something based on Docker or Vagrant or (if you
>> >>   must) VirtualBox.  The newer alternatives make snapshotting very cheap
>> >>   (e.g. by using Linux LXC).  That approach reproduces a full environment as
>> >>   best as we can while still ignoring the hardware layer (and some readers
>> >>   may recall the infamous Pentium bug of two decades ago).
>> >
>> > These two tools look very interesting - but I have, even after reading a
>> > few discussions of their differences, no idea which one is better suited
>> > to be used for what has been discussed here: Making it possible to run
>> > the analysis later to reproduce results using the same versions used in
>> > the initial analysis.
>> >
>> > Am I right in saying:
>> >
>> > - Vagrant uses VMs to emulate the hardware
>> > - Docker does not
>> >
>> Yes.
>>
>>
>> > wherefore
>> > - Vagrant is slower and requires more space
>> > - Docker is faster and requires less space
>> >
>> It depends. For instance, if you run R in VirtualBox under Windows, it may
>> run faster depending on the code you run and, say, the Lapack library used.
>> On a Linux host, R code typically runs 2-3% slower in the VM than natively,
>> but on a Windows host, most of my R code runs faster in the VM... But yes,
>> you need more RAM.
>>
>> With Vagrant, you do not need to keep your VM once you no longer use it.
>> Disk space then shrinks to a few kB, corresponding to the
>> Vagrant configuration file. I guess the same is true for Docker?
>>
>> A big advantage of Vagrant + VirtualBox is that you get very similar
>> virtual hardware, no matter whether your host system is Linux, Windows or
>> Mac OS X. I see this as a good point for better reproducibility.
>>
>>
>> > Therefore, could one say that Vagrant is more "robust" in the long run?
>> >
>> Maybe... but it depends almost entirely on how VirtualBox will support old
>> VMs in the future!
>>
>> PhG
>>
>> > How do they compare in relation to different platforms? Vagrant seems to
>> > be platform agnostic, I can develop and run on Linux, Mac and Windows -
>> > how does it work with Docker?
>> >
>> > I just followed [1] and set up Docker on OS X - looks promising - it also
>> > uses an underlying VM. So both should be equal with regard to
>> > reproducibility in the long run?
>> >
>> > Please note: I see these questions in the light of this discussion of
>> > reproducibility, not with regard to deployment of applications, which is
>> > what the discussions on the web are about.
>> >
>> > Any comments, thoughts, remarks?
>> >
>> > Rainer
>> >
>> >
>> > Footnotes:
>> > [1]  http://docs.docker.io/en/latest/installation/mac/
>> >
>> > --
>> > Rainer M. Krug
>> > email: Rainer<at>krugs<dot>de
>> > PGP: 0x0F52F982
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>

-- 
Rainer M. Krug
email: Rainer<at>krugs<dot>de
PGP: 0x0F52F982