
We have a single-stage Dockerfile building a single image.

For a number of logical reasons, we want to split this build into several stages (e.g. production, development, testing, etc.). We plan to build with BuildKit enabled for all its benefits.

My question is: is there a way to know in advance which targets need rebuilding when the Dockerfile changes? We really don't want to rebuild all images every time the Dockerfile changes. That would be a huge waste of time for the end user, especially since there is a good chance the change is not relevant to that specific end user.

Splitting the Dockerfile into separate files is not an option. There are many shared components between the stages, some of which take a lot of storage. We rely on Docker layering to save storage by reusing those layers between stages.

Here's a simplified example of what the Dockerfile will look like. I don't want to rebuild the "testing" stage if I only changed the toolchain, or added a package to the "developer" stage. Also, the toolchain stage will be shared between many stages, so I can't just embed it inside the "developer" stage.

FROM SomeImage AS toolchain

RUN wget https://path.to/compiler.tar.gz
RUN mkdir -p /path/to/compiler && tar zxvf compiler.tar.gz -C /path/to/compiler
RUN rm compiler.tar.gz

FROM SomeImage AS deployment

# Base packages
RUN apt-get update && apt-get install -y openssl which python3 python3-pip ... 

CMD /bin/bash

FROM deployment AS testing

RUN wget https://path.to/testing.suite.tar.gz
RUN tar zxvf testing.suite.tar.gz
RUN cd testing && ./install_testing_suite
RUN rm -rf testing.suite.tar.gz testing/

FROM deployment AS developer
COPY --from=toolchain /path/to/compiler /path/to/compiler

.....
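To make the rebuild question concrete: a Dockerfile's stage dependencies can be recovered from its `FROM ... AS ...` and `COPY --from=...` lines, and from that graph you can compute which targets a given stage change actually invalidates. The sketch below is a minimal illustration, not a full Dockerfile parser (it ignores `ARG`-substituted stage names, `RUN --mount=from=...`, line continuations, and so on); the stage names follow the example above.

```python
import re
from collections import defaultdict

def stage_graph(dockerfile_text):
    """Map each stage to the stages/images it depends on,
    taken from 'FROM base AS name' and 'COPY --from=stage' lines."""
    deps = defaultdict(set)
    current = None
    for line in dockerfile_text.splitlines():
        m = re.match(r'(?i)^\s*FROM\s+(\S+)(?:\s+AS\s+(\S+))?\s*$', line)
        if m:
            base, name = m.group(1), m.group(2) or m.group(1)
            current = name
            deps[current].add(base)
            continue
        m = re.match(r'(?i)^\s*COPY\s+--from=(\S+)', line)
        if m and current:
            deps[current].add(m.group(1))
    return deps

def affected_targets(deps, changed_stage):
    """Return changed_stage plus every stage that transitively depends on
    it -- i.e. the targets that need rebuilding when that stage changes."""
    affected = {changed_stage}
    grew = True
    while grew:
        grew = False
        for stage, bases in deps.items():
            if stage not in affected and bases & affected:
                affected.add(stage)
                grew = True
    return affected
```

With the example Dockerfile, a change to `toolchain` marks only `toolchain` and `developer` for rebuilding; `testing` is untouched, which is exactly the behaviour wanted above.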

Note: We use Docker 19.03.

  • Docker is smart enough to skip rebuilding image layers if they have not changed. It can detect what has changed and build only that. So your statement that rebuilding takes a huge amount of time is not correct. – Dmitry Apr 11 '22 at 11:48
  • Also, if you really want to keep everything in one file and control what gets built, convert it to a template and generate the desired final Dockerfile based on some conditions. – Dmitry Apr 11 '22 at 11:53
  • @Dmitry you're right, of course; I forgot to mention that the overhead comes from the internal process that rebuilds the images, not directly from the 'docker build' command. Regarding a Dockerfile template, I'm not entirely sure how that would handle the caching of the different shared stages. I tried looking for more information about it but couldn't find any. If you can point me to some, that would be great. – shayst Apr 11 '22 at 12:00
  • docker build -t name . will build whatever is in your Dockerfile. By using any kind of template engine you can include or exclude parts of your Dockerfile. If you push your images to the Docker registry, any other stages should always be able to pick up the versions you want to use from there, whatever they are. However, I am not sure what you meant by caching exactly. – Dmitry Apr 11 '22 at 16:52
  • The design I'm going with is a kind of microservices concept, where each component is its own layer and the final image collects some of those components to create an endpoint image to be used by the client. Some of those components are heavy in storage, and I want to use a single cache instance for all endpoint images. I've tried looking at templates, but I still can't see how they solve the underlying issue: even if I template things, it's still non-trivial to know in advance which of the endpoint images need rebuilding. Maybe I'm missing something? – shayst Apr 12 '22 at 08:47
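Following up on the thread above: one way to know in advance which endpoint images need rebuilding is to hash each stage's own instruction lines and compare them against the hashes recorded at the last build; a stage is dirty only if its own hash, or that of a stage it builds on, has changed. A minimal sketch (the stage-splitting here is naive and ignores `ARG` substitution, multi-line instructions, and similar edge cases):

```python
import hashlib
import re

def stage_hashes(dockerfile_text):
    """Hash the instruction lines of each named stage.  Comparing these
    hashes across two versions of the Dockerfile shows which stages'
    own instructions changed."""
    hashes, current, lines = {}, None, []
    def flush():
        if current is not None:
            hashes[current] = hashlib.sha256(
                "\n".join(lines).encode()).hexdigest()
    for line in dockerfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments don't affect the hash
        m = re.match(r'(?i)^FROM\s+(\S+)(?:\s+AS\s+(\S+))?$', line)
        if m:
            flush()  # close out the previous stage, if any
            current, lines = (m.group(2) or m.group(1)), []
        lines.append(line)
    flush()  # close out the final stage
    return hashes

def changed_stages(old_text, new_text):
    """Stages whose own instructions differ between two Dockerfile versions."""
    old, new = stage_hashes(old_text), stage_hashes(new_text)
    return {s for s in new if old.get(s) != new[s]}
```

Combined with the stage dependency information implied by the `FROM` and `COPY --from` lines, this tells you exactly which endpoint images to rebuild, without rebuilding everything.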

0 Answers