I am trying to schedule an R script to run inside a container. I have a Dockerfile like this:

# Install R version 3.5
FROM rocker/tidyverse:3.5.1

USER root

# Install Ubuntu packages
RUN apt-get update && apt-get install -y \
    sudo \
    gdebi-core \
    pandoc \
    pandoc-citeproc \
    libcurl4-gnutls-dev \
    libcairo2-dev \
    libxt-dev \
    libssl-dev \
    xtail \
    wget \
    cron 



# Install R packrat, which we'll then use to install the other packages
RUN R -e 'install.packages("packrat", repos="http://cran.rstudio.com", dependencies=TRUE);'  


# copy packrat files
COPY  packrat/ /home/project/packrat/
# copy .Rprofile so that R knows where to look for packages
COPY .Rprofile /home/project/
RUN R -e 'packrat::restore(project="/home/project");'

# Copy DB query script into the Docker image
COPY 002_query_db_for_kpis.R  /home/project/002_query_db_for_kpis.R
# copy crontab for db query
COPY db_query_cronjob /etc/crontabs/db_query_cronjob

# set permissions on the crontab file (644 = rw-r--r--, readable but not executable)
RUN chmod 644 /etc/crontabs/db_query_cronjob

# install the crontab for the current user (root)
RUN crontab /etc/crontabs/db_query_cronjob


# start cron in the foreground 
CMD ["cron", "-f"]

It builds ok and then the cron job fails silently. When I investigate with:

docker exec -it 19338f50b4ed Rscript /home/project/002_query_db_for_kpis.R

The output I get is:

Error in library(zoo) : there is no package called ‘zoo’
Execution halted

Now, the first part of the script looks like:

#!/usr/local/bin/env Rscript --default-packages=zoo,RcppRoll,lubridate,broom,magrittr,tidyverse,rlang,RPostgres,DBI

library(zoo)

...

So, clearly it's not finding the packages. They are in there though. That was the whole point of packrat and copying the .Rprofile, and it seemed to work because if I run a shell inside the container while it's running I can find them in:

root@d2b4f6e7eade:/usr/local/lib/R/site-library# 

and all the packrat files seem to be in the right place as well. Could it be that the .Rprofile file isn't being seen because it starts with a '.'? Can I change that?

UPDATE

If I don't use packrat and install the packages normally instead, it works. Digging around inside the container's files, I can see that /usr/local/lib/R/site-library doesn't have the needed packages in it, whereas /home/project/packrat/src does. So it must be to do with Rscript looking in the wrong place. I thought the .Rprofile in /home/project would solve that, but it doesn't. Maybe there is something else I didn't copy over? Although I've got the script running now, it's not ideal, since those packages might be different versions (which is why I want to use packrat). If anyone can figure out how to get it to work with packrat, I'll mark that answer as correct.
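
One possible explanation, offered as an assumption rather than a confirmed diagnosis: R reads a project-level .Rprofile only when it starts in that directory (falling back to the user's home directory), and packrat's .Rprofile is what switches R to the private library under /home/project/packrat. A docker exec without a working directory, or a cron job (which typically starts in the user's home directory), would then never activate packrat. The sketch below compares the library paths Rscript reports from two working directories. `show_libpaths` is a hypothetical helper name, and the `-w` option needs a Docker version whose `docker exec` supports it.

```shell
#!/bin/sh
# Hypothetical helper: print the library paths Rscript sees when it is
# started from a given working directory inside a running container.
# Usage: show_libpaths <container-id> <working-dir>
show_libpaths() {
  if [ "$#" -ne 2 ]; then
    echo "usage: show_libpaths <container-id> <working-dir>" >&2
    return 2
  fi
  # -w sets the working directory for this exec only
  docker exec -w "$2" "$1" Rscript -e 'cat(.libPaths(), sep = "\n")'
}

# Example calls (replace <container-id> with the ID of your running container):
# show_libpaths <container-id> /
# show_libpaths <container-id> /home/project
```

If only the second call lists a path under /home/project/packrat/lib, then prefixing the crontab command with `cd /home/project &&` may be enough for the scheduled job, and a WORKDIR /home/project line in the Dockerfile may do the same for docker exec.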

Tom Greenwood

1 Answer

A couple of things to try, based on the problem and the update:

  1. Have you ignored your packrat/lib* and packrat/src/ directories in .dockerignore? I am worried that you are copying over all the built packages, so restore() thinks the packages have already been built in your container.

  2. Does the root user in your container have executable privileges on the packrat.lock file? If not, that would obviously prevent restore() from running.
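
The first suggestion could be sketched as follows, assuming the build context is the directory that holds the Dockerfile and the packrat/ folder. The two patterns are an assumption based on the directory names mentioned above; adjust them if your layout differs.

```shell
# Run from the Docker build context (assumed: the directory holding the Dockerfile).
# Appends two ignore patterns so that locally built libraries and downloaded
# source tarballs are not sent to the image build.
cat >> .dockerignore <<'EOF'
packrat/lib*
packrat/src
EOF
```

With packrat/src ignored, restore() has to download package sources during the build, so the build needs network access to the configured repositories.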

Alternatively, change the Docker install user to the rocker rstudio image's default, "rstudio", and copy just the packrat.lock and packrat.opts files:

USER rstudio
COPY --chown=rstudio:rstudio packrat/packrat.* /home/project/packrat/

A good reference for these options: https://rviews.rstudio.com/2018/01/18/package-management-for-reproducible-r-code/

  • Thanks for your answer. I don't think the problem is to do with packrat restoring the packages, however, since I can see them being installed during the docker build. The issue is that Rscript can't find them. I think it might be that the #!/usr/local/bin/env Rscript line at the top of the script points to the wrong place, but I haven't had a chance to try changing that because other stuff has got in the way. Thanks again for your response, and I'll let you know once I've had a chance to try it out. – Tom Greenwood Feb 25 '19 at 09:54
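
On the shebang theory: in the docker exec command from the question, the script is passed to Rscript as an argument, so the #! line is not used to choose an interpreter and R treats it as a comment. The sketch below shows the same behaviour with sh standing in for Rscript. It is an analogy under that assumption, not a test of the actual R script, and the interpreter path in it is deliberately bogus.

```shell
# Analogy using sh instead of Rscript: a shebang is consulted only when the
# file itself is executed. When the file is handed to an interpreter as an
# argument, the first line is just a comment.
tmp="$(mktemp -d)"
printf '%s\n' '#!/no/such/interpreter --some-flag' 'echo "body ran"' > "$tmp/demo.sh"

# Interpreter named explicitly: the bogus shebang is ignored and the body runs.
out="$(sh "$tmp/demo.sh")"
echo "$out"

rm -rf "$tmp"
```

The shebang would matter only if the crontab entry ran the script file directly as an executable rather than through Rscript.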