TL;DR
I tried @thaJeztah's answer. `--no-cache` does not only force the current stage to rebuild; it also forces every stage the target depends on to be completely rebuilt. That is not what we want.
But there's a way to force an invalidation only for specific stages: declare a named `ARG` in that stage, do not use it anywhere in the Dockerfile, and pass it as a `--build-arg` to `docker build`.
A new value creates a "different layer" and therefore invalidates everything after the `ARG` line. Docker passes declared build args implicitly to the `RUN` instructions that follow, so a changed value is a cache miss from the first such `RUN` onward.
Rationale
This is an excerpt of my final Dockerfile with 4 stages:
- `repo-sources-base` => The base to download my libraries - don't rebuild
- `repo-sources` => My libraries - rebuild
- `base` => Ubuntu with my apache, php, etc. - don't rebuild
- `production` => The base with my project and my libraries that goes to the production server - rebuild
See here:
#===========================================================================#
# Stage `repo-sources-base` #
# ------------------------- #
# This image installs git so we don't have to install git each time we #
# rebuild the repo-sources. #
#===========================================================================#
FROM ubuntu:20.04 AS repo-sources-base
# Install git.
RUN \
apt-get update && \
apt-get install -y git && \
:
# Scan the gitlab host key.
# Make sure /root/.ssh exists before writing to it.
RUN \
mkdir -p /root/.ssh && \
touch /root/.ssh/known_hosts && \
ssh-keyscan gitlab.com >> /root/.ssh/known_hosts && \
:
#===========================================================================#
# Stage `repo-sources` #
# -------------------- #
# This image contains the SSH private keys to make the clone of the #
# source code from gitlab. This key is passed via ARG to avoid hardcoding #
# it here inside. #
#===========================================================================#
FROM repo-sources-base AS repo-sources
# NOTE THIS ARG!!!!!!!
# Not used anywhere... just declared.
# But invalidates the docker build cache on purpose!
# See the answer text for explanation.
ARG INVALIDATE_CACHE_TIMESTAMP="0000-00-00T00:00:00.000000Z"
# Inspired by https://vsupalov.com/build-docker-image-clone-private-repo-ssh-key/
# Add credentials.
ARG SSH_PRIVATE_KEY
# -p: the directory already exists from the previous stage.
RUN \
mkdir -p /root/.ssh && \
echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa && \
chmod 600 /root/.ssh/id_rsa && \
:
# Clone the needed repos.
RUN mkdir -p /whatever-path/repos
WORKDIR /whatever-path/repos
RUN git clone --quiet git@gitlab.com:my-nice-account/my-nice-project-1.git
RUN git clone --quiet git@gitlab.com:my-nice-account/my-nice-project-2.git maybe_deploy_dir_2
RUN git clone --quiet git@gitlab.com:my-nice-account/my-nice-project-3.git
RUN git clone --quiet git@gitlab.com:my-nice-account/my-nice-project-4.git
#===========================================================================#
# Stage `base` #
# ------------ #
# This image contains the base operating system to build the production #
# release on top of it. It is expected to mutate very slowly so we can have #
# the layers pre-cached when building. #
# TODO: Separate this in 2 bases: one for production and the other for #
# development or testing, like in here: #
# https://www.docker.com/blog/advanced-dockerfiles-faster-builds-and-smaller-images-using-buildkit-and-multistage-builds/ #
#===========================================================================#
FROM ubuntu:20.04 AS base
# Install apache, php, yarn or whatever "base production server"
# BUT NOT your source code, just "the base"
#===========================================================================#
# Stage `production` #
# ------------------ #
# This image will be the one released to pre or prod and configured at #
# runtime via env-vars like backing-services, databases, etc. #
#===========================================================================#
FROM base AS production
# Copy the project files
COPY . /whatever-maybe-other-path/repos/app
# Copy the dependency files
COPY --from=repo-sources /whatever-path/repos /whatever-maybe-other-path/repos
# Continue with the "fine-tuning" after copying the source code, like static building, etc.
The 4 targets here are grouped in 2 blocks of 2:
- `repo-sources-base` and `repo-sources` are meant for the local downloads. They hold the SSH keys, which are neither deployed nor locked into an intermediate layer that goes to production.
- `base` and `production` build the image that goes to production.
The problem is what @Perseids said: running `docker build` just skips the cloning of our repos because it's cached, and passing `--no-cache` rebuilds too much.
With this 4-target structure we build `repo-sources-base` and `base` "once", and we only want to rebuild `repo-sources` and `production`.
The key here is the `ARG` named `INVALIDATE_CACHE_TIMESTAMP`.
Here's the behaviour:
- If you rebuild with `--target repo-sources` many times, it loads from the cache, and we don't want that.
- If you rebuild with `--target repo-sources --no-cache`, it also forces a rebuild of `repo-sources-base`, and we don't want that.
- If you rebuild with `--target repo-sources --build-arg X=Y`, it forces a rebuild only if `X` is declared in the Dockerfile and has a "new value", even if it's not used later. This is why I "name" `INVALIDATE_CACHE_TIMESTAMP` in the `ARG` line.
So doing
docker build --target repo-sources --build-arg INVALIDATE_CACHE_TIMESTAMP=first [...]
will build it.
Doing this "again"
docker build --target repo-sources --build-arg INVALIDATE_CACHE_TIMESTAMP=first [...]
uses the cache.
Changing the value like this
docker build --target repo-sources --build-arg INVALIDATE_CACHE_TIMESTAMP=second [...]
forces a rebuild.
From then on, using `first` or `second` would use the cache, but a new value would force a rebuild.
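To make the rule above concrete, here is a toy model in shell. It is only an illustration under loud assumptions: `cache_key` is a made-up helper that hashes a parent name, an instruction and the build-arg value, and it is not how Docker really computes its cache keys. It just shows why the same value gives a hit and a new value gives a miss.

```shell
#!/usr/bin/env bash
# Toy model of a build-cache lookup. NOT Docker's real algorithm:
# it only illustrates that the lookup key depends on the build-arg value.
set -euo pipefail

# cache_key PARENT INSTRUCTION ARG_VALUE  (hypothetical helper)
# Hashes the three inputs together and prints the hex digest.
cache_key() {
  printf '%s\n%s\n%s\n' "$1" "$2" "$3" | sha256sum | cut -d ' ' -f 1
}

parent="repo-sources-base"
step="RUN <first instruction after the ARG line>"

key_first=$(cache_key "$parent" "$step" "INVALIDATE_CACHE_TIMESTAMP=first")
key_again=$(cache_key "$parent" "$step" "INVALIDATE_CACHE_TIMESTAMP=first")
key_second=$(cache_key "$parent" "$step" "INVALIDATE_CACHE_TIMESTAMP=second")

# Same value: identical key, so the cached layer would be reused.
[ "$key_first" = "$key_again" ] && echo "same value -> cache hit"
# New value: different key, so the step (and everything after it) reruns.
[ "$key_first" != "$key_second" ] && echo "new value  -> cache miss, rebuild"
```

The parent stage is part of the key but never changes here, which matches the goal: `repo-sources-base` stays cached while `repo-sources` is rebuilt.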
So what I do in my build script:
# Microsecond UTC timestamp (needs GNU date), used as the cache-busting value.
NOW=$(date --utc --iso-8601='ns' | sed 's/,/./' | cut -c 1-26 | sed 's/$/Z/')
docker build --target repo-sources --build-arg INVALIDATE_CACHE_TIMESTAMP="${NOW}" [...]
docker build --target production [...]
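The `sed | cut | sed` pipeline in the script is easier to follow on a fixed input. The sketch below runs the same three steps on a sample string. The sample is an assumption about the shape of GNU `date --utc --iso-8601=ns` output (comma before the nanoseconds, `+00:00` offset), and `format_timestamp` is a hypothetical helper name, not part of my script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# format_timestamp: the same pipeline as in the build script, reading the
# raw `date` output from stdin. (Hypothetical helper name.)
format_timestamp() {
  # 1. turn the decimal comma into a dot
  # 2. keep the first 26 characters (date, time, 6 fractional digits)
  # 3. append "Z" to mark UTC
  sed 's/,/./' | cut -c 1-26 | sed 's/$/Z/'
}

# Sample input: assumed shape of `date --utc --iso-8601=ns` output.
sample='2024-01-01T12:00:00,123456789+00:00'
formatted=$(printf '%s\n' "$sample" | format_timestamp)
echo "$formatted"   # 2024-01-01T12:00:00.123456Z
```

Any value that changes on every run would do the job. The timestamp is just convenient because it also tells you when the stage was last rebuilt.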
So it "forces" `repo-sources` to be built again, and therefore the `production` target will take `base` from the cache and `repo-sources` from the latest cached build, the one I forced to rebuild.
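As a sketch, the two build steps can be wrapped in a function so they can be dry-run before touching Docker. Everything beyond the two `docker build` lines is an assumption of mine: the `DOCKER_BIN` override, the `rebuild_images` name, and passing the same extra arguments (the `[...]` above) to both builds.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DOCKER_BIN is a hypothetical override; set it to "echo" for a dry run.
DOCKER_BIN="${DOCKER_BIN:-docker}"

# rebuild_images [extra docker build arguments...]
# Rebuilds repo-sources with a fresh cache-busting value, then production.
rebuild_images() {
  local now
  # Requires GNU date, as in the original script.
  now=$(date --utc --iso-8601='ns' | sed 's/,/./' | cut -c 1-26 | sed 's/$/Z/')

  # A new timestamp on every run forces this stage to rebuild.
  "$DOCKER_BIN" build --target repo-sources \
    --build-arg "INVALIDATE_CACHE_TIMESTAMP=${now}" "$@"

  # No build arg here: base comes from the cache, repo-sources from the
  # build that was just forced.
  "$DOCKER_BIN" build --target production "$@"
}

# Example (dry run, prints the two commands instead of running them):
#   DOCKER_BIN=echo; rebuild_images .
```

With `DOCKER_BIN=echo` the function only prints the two commands, which is a cheap way to check the generated timestamp and arguments.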