
Problem

I have a multi-stage (two-stage) Docker build for my container, let's call it cont, that I want to automate via GitHub Actions. The first stage/Docker image of the build process seldom changes and takes very long to build; let's call this cont-build. I want to reduce the build duration by not building cont-build every time I build the whole project.

When running that build locally, I have the image cont-build easily available through my local docker instance. I struggle to transfer this simple availability to GitHub Actions.

I checked the Docker and GitHub docs, but was unable to find a way of implementing this. It is so simple on a local machine, so I thought it cannot be that hard on GitHub-Actions...

Approach

To persist the cont-build image, there seem to be two approaches:

  • Use some sort of GitHub cache. I am not sure how long images stay cached.
  • Pull the image from DockerHub, which for long build times may be much faster than building.

The second one seems more straightforward and less complex to me. So my approach was to publish cont-build to DockerHub and pull cont-build in the GitHub Action every time I want to build cont.

I tried using uses: Docker://${{ secrets.DOCKERHUB_USERNAME }}/cont-build, but I do not know where to place it.

Question

Where/how do I pull the image cont-build that is required by Dockerfile-cont in the "Build and push" step of the workflow below? Also, if my approach is bad, what is the general approach to multi-stage builds where one stage never or seldom changes, especially given that GitHub caches might be deleted after a while?

I realise that I can use something like FROM mydockerID/cont-build:latest in Dockerfile-cont, but that does not seem to be the solution that leverages the whole GitHub-Workflow environment. This would also mean that I have to enter my docker-ID in clear text as opposed to using a GitHub-Secret.

name: CI for cont

on: workflow_dispatch

jobs:
  docker:
    runs-on: ubuntu-latest    
    steps:
      -
        name: Checkout
        uses: actions/checkout@v2      
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1        
      -
        name: Login to DockerHub
        uses: docker/login-action@v1         
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      -
        name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2     
        with:
          context: ./Code/
          file: ./Code/Dockerfile-cont
          push: true
          tags: ${{ secrets.DOCKERHUB_USERNAME }}/cont:latest
      -
        name: Image digest                  
        run: echo ${{ steps.docker_build.outputs.digest }}

joba2ca

1 Answer


The problem with multi-stage builds is that if you want caching to work you need:

  1. Access to the images of the intermediate stages, not just the final image, during the rebuild.
  2. To use --cache-from to refer to the previous images, including intermediate steps.

If you think about how rebuilds work: if the intermediate stages are missing, the builder will go "huh, I guess I don't have that in cache" and rebuild; it can't tell whether the final stage needs to be rebuilt until it has gone through all previous steps.

So you need to do the following song and dance, assuming two stages, "build" and "runtime":

  1. Pull "yourimage:latest" and "yourimage:build".
  2. Build and tag each stage, the intermediate one as "yourimage:build" (selected with --target) and the final one as "yourimage:latest", passing --cache-from=yourimage:build --cache-from=yourimage:latest.
  3. Push both those images.
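
The steps above could be sketched as a shell script roughly like the following. This is only an illustration under some assumptions: the image name "yourimage" is the placeholder used above, the first stage in the Dockerfile is assumed to be declared as "build" (FROM ... AS build), and BUILDKIT_INLINE_CACHE=1 is assumed to be needed so that pushed images can serve as a cache source when BuildKit is the builder. The IMAGE and DRY_RUN variables and the run helper are hypothetical names introduced here; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: pull, build with --cache-from, push, for a two-stage Dockerfile.
set -eu

IMAGE="${IMAGE:-yourimage}"   # placeholder image name (assumption)
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_with_cache() {
  # 1. Pull the previous images; tolerate failure on the very first run.
  run docker pull "$IMAGE:build" || true
  run docker pull "$IMAGE:latest" || true

  # 2a. Build and tag the intermediate stage (assumed to be named "build").
  run docker build --target build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from="$IMAGE:build" \
    --tag "$IMAGE:build" .

  # 2b. Build and tag the final stage, using both images as cache sources.
  run docker build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from="$IMAGE:build" --cache-from="$IMAGE:latest" \
    --tag "$IMAGE:latest" .

  # 3. Push both images so the next build can pull them.
  run docker push "$IMAGE:build"
  run docker push "$IMAGE:latest"
}

build_with_cache
```

In a GitHub Actions job this could be called from a run: step after the DockerHub login, for example with IMAGE set from the username secret and DRY_RUN=0.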

A more detailed explanation and an example solution are at https://pythonspeed.com/articles/faster-multi-stage-builds/

Itamar Turner-Trauring