Different behaviors between a Docker image and the same image imported in singularity

Question

I recently started using Docker to ensure the computational reproducibility of my research. Because the HPC service at my institution supports only Singularity, I import the Docker image into Singularity when I run part of my analysis on the HPC. When I did this, however, I found that the results from the original Docker image differ from those obtained with the same image imported into Singularity.

Here is what I did to fit a simple Bayesian regression model directly in the Docker image. I ran this both locally and on an AWS instance, and the output was identical (as expected).

docker pull akiramurakami/gramm-mor:v1.0
docker run -it akiramurakami/gramm-mor:v1.0 bash
Rscript -e 'library("brms"); library("tidyverse"); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'

Below is part of the output.

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.10    -1.23    -0.85 1.00     3812     2469
x             2.00      0.11     1.79     2.21 1.00     4625     3037

Here is what I did on the HPC, using Singularity.

singularity pull docker://akiramurakami/gramm-mor:v1.0
singularity exec gramm-mor_v1.0.sif Rscript -e 'library("brms"); library("tidyverse"); set.seed(1); d <- tibble(x = rnorm(100), y = 2 * x - 1 + rnorm(100)); m <- brm(y ~ x, data = d, seed = 1); summary(m)'

The results are different (see the Bulk_ESS and Tail_ESS columns; some of the credible interval bounds also differ in the second decimal place).

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -1.04      0.10    -1.23    -0.84 1.00     3798     2826
x             2.00      0.11     1.78     2.22 1.00     4275     2913

Why does this happen, and is there a way to import and use a Docker image in Singularity so that it yields the same results as the original Docker image?

Below is the Dockerfile used.

FROM rocker/r-ver:3.6.3
LABEL "maintainer"="xxx"

RUN apt-get update -qq && apt-get -y --no-install-recommends install \
  file \
  git \
  libapparmor1 \
  libclang-dev \
  libcurl4-openssl-dev \
  libedit2 \
  libssl-dev \
  lsb-release \
  multiarch-support \
  psmisc \
  procps \
  python-setuptools \
  sudo \
  wget \
  libxml2-dev \
  libcairo2-dev \
  libsqlite-dev \
  libmariadbd-dev \
  libmariadbclient-dev \
  libpq-dev \
  libssh2-1-dev \
  unixodbc-dev \
  libsasl2-dev \
  clang
  
# https://github.com/stan-dev/rstan/wiki/Installing-RStan-on-Linux
RUN Rscript -e 'dotR <- file.path(Sys.getenv("HOME"), ".R"); \
  if (!file.exists(dotR)) dir.create(dotR); \
  M <- file.path(dotR, "Makevars"); \
  if (!file.exists(M)) file.create(M); \
  cat("\nCXX14FLAGS=-O3 -march=native -mtune=native -fPIC","CXX14=clang++",file = M, sep = "\n", append = TRUE)'

RUN Rscript -e 'options(repos = list(CRAN = "http://mran.revolutionanalytics.com/snapshot/2020-07-01")); \
  install.packages(c("brms", "data.table", "devtools", "SnowballC", "tidyverse", "dplyr"))'
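The Dockerfile above writes its compiler settings to $HOME/.R/Makevars at build time. Singularity, by default, runs the container as the calling user and mounts that user's home directory, so $HOME inside the container may not be the one the image was built with. Whether this contributes to the difference is not established here. The following is only a sketch of how the two environments could be compared; makevars_report is a hypothetical helper name, not part of R, Docker, or Singularity.

```shell
#!/bin/sh
# makevars_report is a hypothetical helper (an assumption for illustration).
# It prints the current user, the home directory, and the C++14 settings
# found in the user-level Makevars file, if that file exists.
makevars_report() {
  echo "user=$(id -un 2>/dev/null || echo unknown)"
  echo "HOME=$HOME"
  if [ -f "$HOME/.R/Makevars" ]; then
    echo "Makevars=present"
    # Show only the C++14 lines, such as those appended by the Dockerfile.
    grep -E '^CXX14(FLAGS)?=' "$HOME/.R/Makevars" || true
  else
    echo "Makevars=absent"
  fi
  # If R is on the PATH, ask it which C++14 compiler and flags it reports.
  if command -v R >/dev/null 2>&1; then
    echo "R_CXX14=$(R CMD config CXX14 2>/dev/null)"
    echo "R_CXX14FLAGS=$(R CMD config CXX14FLAGS 2>/dev/null)"
  fi
}

makevars_report
```

Running the same script once inside the Docker container and once under singularity exec, then comparing the two outputs, would show whether both runs see the same Makevars and the same compiler.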

Update on the 29th of August, 2020:
I have asked the same question at the Stan Forums and received some useful comments (although the exact reason for the difference remains unclear).

I can't offer much more than a guess, but ... it has a lot of dependencies, there's a reasonable chance that they change some of their performance assumptions based on the operating environment, including CPUs/cores and their multi-threading capacity. — r2evans, Aug 19 '20 at 23:44
Wouldn't I then expect different results between my local machine (OS X) and an instance at AWS (RHEL) as well? I got identical results using the same Docker image in those two environments, which led me to suspect that the difference observed is due to the difference between Docker and singularity. I'm new to containerization, so might be wrong. — Akira Murakami, Aug 20 '20 at 19:36
I don't know, but you haven't talked about different *resources* on both. For instance, how many CPUs and threads? How much RAM? I can't list them all off-hand, but I know that some heuristics (specifically in optimization, but others exist) will make different heuristic decisions based on different resources. For instance, if you have a gazillion threads, then a heuristic might recommend a brute-force attempt at something, whereas with 1-2 threads, it might go with something a little more conservative. — r2evans, Aug 20 '20 at 21:14
Further, I know *nothing* about `singularity`, so I'm just throwing out guesses, sorry I don't have something more insightful. — r2evans, Aug 20 '20 at 21:15
I can reproduce your issue, but I am not sure why it is happening. Setting the seed makes the individual results repeatable within their container type, but there's still that small difference between the two. My guess is that there's something about stan's sampling that is handled slightly differently (singularity's sampling runs after the first are generally faster than docker's). I'd suggest raising the question at the stan issues and see if they have more insight. — tsnowlan, Aug 24 '20 at 08:38
Thanks! Asking at the Stan Forums sounds like a good idea and I'll do it. — Akira Murakami, Aug 25 '20 at 00:26
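The comments above ask about the resources available in each environment (CPUs, threads, RAM). A minimal sketch of how these could be recorded on a Linux host or inside a container is given below. resource_report is a hypothetical helper, and the particular CPU flags it checks are an assumption about what might be relevant, not something confirmed in this thread.

```shell
#!/bin/sh
# resource_report is a hypothetical helper (an assumption for illustration).
# It prints the CPU count, CPU model, selected instruction-set flags,
# and total memory, reading from /proc where that is available.
resource_report() {
  echo "cpus=$(nproc 2>/dev/null || echo unknown)"
  if [ -r /proc/cpuinfo ]; then
    echo "cpu_model=$(grep -m1 'model name' /proc/cpuinfo | cut -d: -f2- | sed 's/^ *//')"
    # Take the first "flags" line and test for a few vector extensions.
    flags=$(grep -m1 '^flags' /proc/cpuinfo)
    for f in sse4_2 avx avx2 fma avx512f; do
      case " $flags " in
        *" $f "*) echo "$f=yes" ;;
        *)        echo "$f=no" ;;
      esac
    done
  fi
  if [ -r /proc/meminfo ]; then
    echo "mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
  fi
}

resource_report
```

Saving the output from the local machine, the AWS instance, and the HPC node side by side would make it easier to see which of these properties, if any, line up with the differing results.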
