
I am not completely sure whether Docker is enough for R development or whether I should use it in conjunction with Packrat. I have read several posts stating that Docker is sufficient, but the only one that backs up this claim is this post. However, I was not able to build that example due to errors in the git2r installation.

My overall goal is to have full control of the package versions I use, so that my analysis will still work even if a package is later upgraded.

2 Answers

You need both. Think of the Docker image as just the final product of your source code, including the Dockerfile and every piece of data used to build the final image.

You should pin the Docker base image (avoid FROM blah:latest) to be sure that the underlying libraries and tools will always be the same. Don't use base images such as debian/testing that may change on every run of apt-get install.

If you don't use packrat, then when you need to rebuild your image you may get a newer version of some library in which your code no longer works, for instance because a deprecated function you used has been removed.

And of course version your own code; at least tag it so that you can easily go back in time and start a new build.

This is the minimum you can do, because things like a broken Docker Hub or CRAN repository can still happen. Saving a versioned Docker image in a private Docker registry is just the final step.
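As a rough illustration of the points above, a Dockerfile along these lines combines a pinned base image with a packrat restore. This is only a sketch under assumptions: the tag `rocker/rstudio:3.5.0`, the project path, the CRAN mirror URL and the script name `analysis.R` are placeholders, and it assumes the project was already initialised with packrat locally so that `packrat/packrat.lock` exists. Some packages may also need system libraries installed first, and on an old base image the CRAN mirror may no longer serve a packrat build that installs cleanly.

```dockerfile
# Pin the base image to a specific tag instead of :latest (tag chosen as an example)
FROM rocker/rstudio:3.5.0

# Hypothetical project location inside the image
WORKDIR /home/rstudio/project

# Copy only the lockfile first, so this layer is rebuilt only when package versions change
COPY packrat/packrat.lock packrat/packrat.lock

# Install packrat itself, then restore the package versions recorded in the lockfile.
# Note: the packrat version is not pinned here; the mirror URL is an assumption.
RUN R -e 'install.packages("packrat", repos = "https://cloud.r-project.org"); packrat::restore()'

# Copy the analysis code last (analysis.R is a hypothetical file name)
COPY analysis.R .
```

Building this from a tagged commit of your own repository, and pushing the resulting image to a registry you control, covers the remaining two points.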

Leo
  • So would the following be a good pipeline: 1. Build and run docker starting with FROM rocker/rstudio in the dockerfile 2. Init the project with packrat and start adding libraries 3. Code the rest, or do all the installations inside the dockerfile? – Vangelis Theodorakis Jun 21 '18 at 15:47
  • I am not talking about developing inside a Docker container. If you intend to deliver the final Docker image, you should do all the required installation while building the image (I assume this is your last step); with packrat this should become a `packrat::restore()`. – Leo Jun 21 '18 at 15:59
  • So you add packrat and install all the other libraries as you normally would, write the code locally as you always do, and then create the image, right? – Vangelis Theodorakis Jun 21 '18 at 17:54

Let's say you use a certain docker image to do your analysis now. If you later start the same docker image, i.e. not just the same name (e.g. rocker/rstudio) or the same version (e.g. rocker/rstudio:3.5.0) but the same image id, you are guaranteed to get the exact same versions of R, R packages and system libraries. This is more than what packrat offers (same R package versions), but requires you to save the docker image.

Ralf Stubner