Suppose I am building an image using Docker BuildKit. The image comes from a multi-stage Dockerfile, like so:

FROM node:12 AS some-expensive-base-image
...

FROM some-expensive-base-image AS my-app
...

I want to build both images and push them to Docker Hub. With Docker BuildKit's external caching feature, I want to save build time on my CI pipeline by using the remote some-expensive-base-image:latest image as the cache when building the some-expensive-base-image target. For the second image, I want to use both the just-built some-expensive-base-image image and the remote my-app:latest image as caches. I believe I need both so that the steps of some-expensive-base-image do not have to be rebuilt, since...well...they are expensive.

This is what my build script looks like:

export DOCKER_BUILDKIT=1
docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from some-expensive-base-image:latest --target some-expensive-base-image -t some-expensive-base-image:edge .
docker build --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from some-expensive-base-image:edge --cache-from my-app:latest --target my-app -t my-app:edge .

My question: Does the order of the --cache-from arguments matter for the second docker build?

I have been getting inconsistent results on my CI pipeline for this build. Cache misses happen when building the latter image, even though there haven't been any code changes that would bust the cache. The cache manifest can be pulled without issue. Sometimes the cache image is pulled, but other times all steps of that latter target need to be rerun. I don't know why.

By chance, should I instead try to docker pull both images before running the docker build commands in my script?

Also, I know that I referred to Docker Hub in my example, but in real life, my application uses AWS ECR for its remote Docker repository. Would that matter for proper Buildkit functionality?

ecbrodie

1 Answer

Yes, the order of --cache-from matters!

See the explanation on GitHub from the person who implemented the feature, quoted here:

When using multiple --cache-from they are checked for a cache hit in the order that user specified. If one of the images produces a cache hit for a command only that image is used for the rest of the build.
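
To compare orderings on CI, it may help to make the order explicit in the build script. The following is only a sketch: cache_from_args is a hypothetical helper (not part of Docker), and the order shown (my-app:latest first) is just one ordering to try, not a confirmed recommendation. It prints the command as a dry run instead of executing it.

```shell
#!/bin/sh
# Hypothetical helper (not a Docker feature): turns an ordered list of image
# references into --cache-from flags, preserving the order given.
cache_from_args() {
  out=""
  for ref in "$@"; do
    # Append with a separating space only when $out is already non-empty.
    out="${out:+$out }--cache-from $ref"
  done
  printf '%s\n' "$out"
}

# Dry run: echo the command instead of running it, so the flag order is visible.
# Assumption: my-app:latest is listed first; swap the arguments to test the
# other order. The unquoted $(...) is intentional so the flags split into words.
echo docker build --build-arg BUILDKIT_INLINE_CACHE=1 \
  $(cache_from_args my-app:latest some-expensive-base-image:edge) \
  --target my-app -t my-app:edge .
```

Dropping the leading echo runs the real build. Because the quoted behaviour dates from 2017, the effect of the order is worth verifying against the Docker version the pipeline actually uses.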

I've had similar problems in the past. You might find it useful to check this answer, where I've shared how I use the Docker cache in CI.

Elias Dorneles
  • Thanks for the reply. I have switched employers since I posted this, so I'm no longer working on a project where I maintain Docker builds personally. But anyway, my team at the time found the Docker BuildKit caching capabilities with `--cache-from` to be too cumbersome and buggy, so we just gave up on using that arg for our builds. – ecbrodie Jun 24 '21 at 01:27
  • Also, the answer from the `--cache-from` feature author was from 2017, almost 4 years ago. It could be quite possible that the behaviour (specifically, whether Docker pulls if the image is not found in the local cache) has changed since then. – ecbrodie Jun 24 '21 at 01:29