28

On my local machine, I have built the latest image, and running another docker build uses cache everywhere it should.

I then push the image to the registry as latest. On my CI server, I pull that latest image so I can use it as the build cache for the new version:

docker pull $CONTAINER_IMAGE:latest

docker build --cache-from $CONTAINER_IMAGE:latest \
             --tag $CONTAINER_IMAGE:$CI_COMMIT_SHORT_SHA \
             .

From the build output we can see that the COPY of the Gemfile does not use the cache from the latest image, even though I haven't changed that file:

Step 15/22 : RUN gem install bundler -v 1.17.3 &&     ln -s /usr/local/lib/ruby/gems/2.2.0/gems/bundler-1.16.0 /usr/local/lib/ruby/gems/2.2.0/gems/bundler-1.16.1
 ---> Using cache
 ---> 47a9ad7747c6
Step 16/22 : ENV BUNDLE_GEMFILE=$APP_HOME/Gemfile     BUNDLE_JOBS=8
 ---> Using cache
 ---> 1124ad337b98
Step 17/22 : WORKDIR $APP_HOME
 ---> Using cache
 ---> 9cd742111641
Step 18/22 : COPY Gemfile $APP_HOME/
 ---> f7ff0ee82ba2
Step 19/22 : COPY Gemfile.lock $APP_HOME/
 ---> c963b4c4617f
Step 20/22 : RUN bundle install
 ---> Running in 3d2cdf999972

Side note: it works perfectly on my local machine.

The "Leverage build cache" section of the Docker documentation doesn't seem to explain this behaviour: neither the Dockerfile nor the Gemfile has changed, so the cache should be used.

What could prevent Docker from using the cache for the Gemfile?

Update

I tried to copy the files setting the right permissions using COPY --chown=user:group source dest but it still doesn't use the cache.

Opened Docker forum topic: https://forums.docker.com/t/docker-build-not-using-cache-when-copying-gemfile-while-using-cache-from/69186

V-R
ZedTuX
  • 2
    I'm having exactly the same issue, do you have any update on it? – Marcelo Apr 16 '19 at 18:04
  • Sorry, no I don't @Marcelo, I had to switch to another task, this issue is in a pending state on my side. I will come back on this but I don't know when yet. – ZedTuX Apr 17 '19 at 04:51
  • 1
    Thanks, I'm still investigating. If I have any news, I'll let you know here. – Marcelo Apr 17 '19 at 07:09
  • 1
    In my case, I was trying to build an image in Travis that would cache from another image that was built in Dockerhub (and then I got the same problem as you). I'm not 100% sure that's the cause yet, but Travis is probably not caching from the Dockerhub image because they use different Docker versions (which produce different hashes for the same content). The fix for me was to build and push everything in Travis and use Dockerhub only as the image registry, since Dockerhub is probably using an old Docker version to generate the images. – Marcelo Apr 23 '19 at 10:35
  • @Marcelo I had exactly the same issue, and I can confirm that something in DockerHub automated builds makes `COPY` always bust the cache. Very annoying, and I didn't find any reference to it. – Andre Miras Oct 21 '19 at 12:48

4 Answers

39

Let me share with you some information that helped me to fix some issues with Docker build and --cache-from, while optimizing a CI build.

I struggled for several days because I didn't understand the feature correctly; I was relying on incorrect explanations found around the web.

So I'm sharing this here hoping it will be useful to you.

Using --cache-from is exclusive: the local Docker cache won't be used

This means that it doesn't add new caching sources: the image tags you provide will be the only caching source for the Docker build.

Even if you just built the same image locally, the next time you run docker build for it, in order to benefit from the cache, you need to either:

  1. provide the correct tag(s) with --cache-from (and with the correct precedence); or

  2. not use --cache-from at all (so that it will use the local build cache)

When providing multiple --cache-from, the order matters

The order is very important: at the first match, Docker stops looking for other matches and uses that image for all remaining commands.

This is explained by the person who implemented the feature in the GitHub PR:

When using multiple --cache-from they are checked for a cache hit in the order that user specified. If one of the images produces a cache hit for a command only that image is used for the rest of the build.

There is also a lengthier explanation in the initial ticket proposal:

Specifying multiple --cache-from images is bit problematic. If both images match there is no way(without doing multiple passes) to figure out what image to use. So we pick the first one(let user control the priority) but that may not be the longest chain we could have matched in the end. If we allow matching against one image for some commands and later switch to a different image that had a longer chain we risk in leaking some information between images as we only validate history and layers for cache. Currently I left it so that if we get a match we only use this target image for rest of the commands.
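
To keep the precedence explicit in a CI script, the flags can be generated from an ordered list. This is only a minimal sketch, assuming a POSIX shell; the function name cache_from_flags is hypothetical and not part of Docker:

```shell
#!/bin/sh
# Hypothetical helper: print one --cache-from flag per image, in the order given.
# Per the quotes above, the first image that produces a cache hit is the one
# used for the rest of the build, so pass the preferred image first.
cache_from_flags() {
    flags=""
    for image in "$@"; do
        flags="$flags --cache-from $image"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}
```

It could then be used as: docker build $(cache_from_flags "$CONTAINER_IMAGE:latest" "$OTHER_IMAGE") --tag "$CONTAINER_IMAGE:$CI_COMMIT_SHORT_SHA" . where OTHER_IMAGE is a placeholder for any second cache source.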

If the parent image changes, the cache will be invalidated

For example, if you have an image based on docker:stable, and docker:stable gets updated, the cached builds of your image will not be valid anymore as the layers of the base image were changed.

This is why, if you're configuring a CI build, it can be useful to docker pull the base image as well and include it in the --cache-from, as mentioned in this comment in yet another GitHub discussion.
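
A CI step along those lines might look like the following sketch. The function names and the DOCKER override variable are assumptions made for illustration, and listing the app image before the base image is a guess that the app image holds the longest matching chain:

```shell
#!/bin/sh
# Sketch only: function names and the DOCKER override are hypothetical.

# Pull an image to serve as a cache source. A missing image (for example on
# the very first build) is reported but does not fail the CI job.
pull_cache_source() {
    "${DOCKER:-docker}" pull "$1" || echo "warning: could not pull $1" >&2
}

# Usage: build_with_cache APP_IMAGE BASE_IMAGE TAG
build_with_cache() {
    app_image=$1 base_image=$2 tag=$3
    pull_cache_source "$base_image"
    pull_cache_source "$app_image:latest"
    "${DOCKER:-docker}" build \
        --cache-from "$app_image:latest" \
        --cache-from "$base_image" \
        --tag "$app_image:$tag" \
        .
}
```

For the docker:stable example above, the call would be: build_with_cache "$CONTAINER_IMAGE" docker:stable "$CI_COMMIT_SHORT_SHA". Setting DOCKER=echo prints the commands instead of running them, which is handy for checking the flag order.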

Elias Dorneles
5

I struggled with this problem too. In my case, I was using COPY where the checksum might have changed (but only technically; the content was functionally identical). I worked around it this way:

Dockerfile:

ARG builder_image=base-builder

# Compilation/build stage
FROM golang:1.16 AS base-builder
RUN echo "build the app" > /go/app

# This stage exists to help the Docker cache. By setting the `builder_image` build argument
# we can essentially skip the build stage and use a prebuilt image directly.
FROM $builder_image AS builder

# myapp docker image
FROM ubuntu:20.04 AS myapp

COPY --from=builder /go/app /opt/my-app/bin/

Then, I can run the following:

# build cache
DOCKER_BUILDKIT=1 docker build --target base-builder -t myapp-builder .
docker push myapp-builder

# use cache
DOCKER_BUILDKIT=1 docker build --target myapp --build-arg=builder_image=myapp-builder -t myapp .
docker push myapp

This way we can force Docker to use a prebuilt image as a cache.

Denis V
  • This is the only way I found to use a prebuilt image on another build host (I gave up on the --cache-from approach). Thank you so much! – Fumisky Wells May 24 '21 at 02:28
2

This is for anyone fighting with DockerHub automated builds and --cache-from. I found that images built by DockerHub always cause a cache bust on COPY commands when pulled and used as a build cache source. That also seems to be the case for @Marcelo (see his comment).

I investigated by creating a very simple image that runs a couple of RUN commands followed by a COPY. Everything uses the cache except the COPY, even though the content and permissions of the copied file are the same in both the pulled image and the one built locally (verified via sha1sum and ls -l).

The solution for me was to publish the image to the registry from CI (Travis in my case) rather than letting the DockerHub automated build do it. To be clear, I'm talking about the specific case where the files are definitely identical and should not bust the cache, but you're using DockerHub automated builds.

I'm not sure why that is, but I know, for instance, that old docker-engine versions (prior to 1.8.0) didn't ignore file timestamps when deciding whether to use the cache; see https://docs.docker.com/release-notes/docker-engine/#180-2015-08-11 and https://github.com/moby/moby/pull/12031.

Andre Miras
1

For a COPY command to be cached, the checksum of the source being copied needs to be identical. You can compare the checksum in the docker history output between the cache image and the one you just built. Most importantly, the checksum includes metadata such as the file owner and file permissions, in addition to the file contents. Whitespace changes inside a file, such as switching line endings between Linux and Windows styles, also affect it. If you download the code from a repo, the metadata, such as the owner, is likely to differ from the cached value.
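
One way to spot such differences is to print the content hash together with the mode and owner of each copied file on both machines and diff the results. The sketch below assumes GNU coreutils (sha256sum and stat -c); the function name is hypothetical, and the output only approximates the inputs described above, it is not the checksum Docker itself computes:

```shell
#!/bin/sh
# Hypothetical helper, assuming GNU coreutils.
# Prints "<sha256> <octal mode> <uid>:<gid> <path>" for each file, so the
# output from the machine that built the cache image can be compared with
# the output from the CI runner.
copy_fingerprint() {
    for f in "$@"; do
        hash=$(sha256sum "$f" | cut -d' ' -f1)
        meta=$(stat -c '%a %u:%g' "$f")
        printf '%s %s %s\n' "$hash" "$meta" "$f"
    done
}
```

Running copy_fingerprint Gemfile Gemfile.lock on the local machine and in the CI job, then comparing the two outputs, shows whether content, permissions, or ownership differ between the two checkouts.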

BMitch
  • 2
    Thank you for your comment, I'll look at that. Side note: you're wrong about the modification time being included in the checksum, as the doc says "The last-modified and last-accessed times of the file(s) are not considered in these checksums." (See my "Leverage build cache" link ;)) – ZedTuX Feb 15 '19 at 10:04
  • @ZedTuX indeed, looks like build does ignore that timestamp change for the build cache. I was thinking of the layer diff where it looks at what files to include in a new layer. – BMitch Feb 15 '19 at 13:09