Images in Docker are referred to by a reference, the most common form being an image repository and tag. The tag is a relatively free-form string that points to a specific image. Tags are best thought of as mutable pointers: a tag can be changed, multiple tags can point to the same image, and a tag can be deleted while the underlying image remains intact.
Since Docker does not enforce much structure on tags (other than verifying that a tag contains only valid characters and does not exceed a length limit), imposing a scheme is left to each repository maintainer, and many different solutions have resulted.
For repository maintainers, here are a few common implementations:
Option A: Ideally, repository maintainers follow some form of semver. This version number should map to the version of the packaged software, often with an additional patch number for the image revision. Importantly, images tagged this way should include tags not just for version 1.2.3-1, but also 1.2.3, 1.2, and 1, each of which is updated to the latest release within its respective hierarchy. This allows downstream users to depend on 1.2 and automatically get the updates for 1.2.4, 1.2.5, etc., as bug fixes and security updates come out.
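The tag hierarchy described above can be generated mechanically at publish time. Here is a minimal sketch in POSIX shell. The function name expand_tags is hypothetical, and it assumes versions of the form MAJOR.MINOR.PATCH with an optional -REVISION suffix.

```shell
# Print every tag that should point at a release, most specific first.
# Hypothetical helper, not a Docker feature; assumes a version like 1.2.3-1.
expand_tags() {
  full=$1
  version=${full%%-*}                 # drop the image revision: 1.2.3-1 -> 1.2.3
  if [ "$version" != "$full" ]; then
    echo "$full"
  fi
  while :; do
    echo "$version"
    case $version in
      *.*) version=${version%.*} ;;   # 1.2.3 -> 1.2 -> 1
      *)   break ;;
    esac
  done
}
```

Running expand_tags 1.2.3-1 prints 1.2.3-1, 1.2.3, 1.2, and 1, one per line. A publish script could loop over that output and run docker tag and docker push for each name.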
Option B: Similar to the semver option above, many projects include other important metadata with their tags, e.g. which architecture, or base image, was used for that build. This is commonly seen with alpine vs debian/slim images, or arm vs amd compiled code. These will often be combined with semver, so you may see tags like alpine-1.5, in addition to alpine-1 and alpine tags.
Option C: Some projects follow more of a rolling release that offers no backward-compatibility promises. This is often done with build numbers or a date string, and indeed Docker itself uses this, though with a process to deprecate features and avoid breaking changes. I've seen quite a few internal projects within companies use this strategy to version their images, relying on the build number from a CI server.
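As a rough illustration of the rolling-release style, a tag can be composed from a UTC date and the CI build number. This is only a sketch: rolling_tag is a hypothetical helper, and BUILD_NUMBER is an assumed environment variable that your CI server may name differently.

```shell
# Compose a rolling-release tag such as 20240131.57.
# Arguments are optional: the date defaults to today (UTC) and the build
# number defaults to $BUILD_NUMBER (an assumed CI variable), then to 0.
rolling_tag() {
  build_date=${1:-$(date -u +%Y%m%d)}
  build_number=${2:-${BUILD_NUMBER:-0}}
  echo "$build_date.$build_number"
}
```

The date prefix makes it obvious which of two tags is newer, which is the property that plain Git hashes lack.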
Option D: I'm less of a fan of putting Git revision hashes in image tags, since these convey no details without referring back to the Git repository. Not every user has the access or the skill to interpret that reference. And by looking at two different hashes, I have no idea which is newer or compatible with my application without an external check. Hashes as tags also assume the sole important version number is the one from Git, and ignore that the same Git revision may be used to create multiple images: from different parent images, for different architectures, or just from multiple Dockerfiles or multi-stage targets within the same Git repo. Instead, I like using Label Schema labels, and eventually the image spec annotations once we get tooling around image annotations, to track details like Git revisions. This places the Git revision into metadata that you can query to verify an image, while still leaving the tag itself informative to users.
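To make the label approach concrete, here is a minimal sketch that records the Git revision as a label at build time and reads it back later. It uses the org.opencontainers.image.revision key from the OCI image spec annotations as the label name; the function names are hypothetical, and the DOCKER variable is an assumption added so the commands can be printed with DOCKER=echo instead of executed.

```shell
# Build an image that carries the Git revision as a label.
# DOCKER defaults to the real CLI; set DOCKER=echo for a dry run.
build_with_revision() {
  image=$1
  revision=$2
  ${DOCKER:-docker} build \
    --label "org.opencontainers.image.revision=$revision" \
    --tag "$image" .
}

# Read the recorded revision back from a local image.
show_revision() {
  ${DOCKER:-docker} inspect \
    -f '{{index .Config.Labels "org.opencontainers.image.revision"}}' "$1"
}
```

A build script might call build_with_revision example/app:1.2.3-1 "$(git rev-parse HEAD)", where example/app is a placeholder repository name. The tag stays readable and the hash stays queryable.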
For image users, if you have a requirement to avoid unexpected changes from upstream, there are two options I know of.
The first is to run your own registry server, and pull your external dependencies to a local server. Docker includes an image for a standalone registry that you can install, and the registry API is open, which has allowed many artifact repository vendors to support it. Do take care to regularly update the images in this registry, and include a way to go back to previous versions if an update breaks your environment.
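A minimal sketch of that first option follows, assuming the registry:2 image listening on localhost:5000 with no TLS or authentication (fine for a test, not for production). The function names and the DOCKER dry-run variable are hypothetical.

```shell
# Start a standalone registry on localhost:5000.
# DOCKER defaults to the real CLI; set DOCKER=echo to print the commands.
start_registry() {
  ${DOCKER:-docker} run -d -p 5000:5000 --name registry registry:2
}

# Copy an upstream image into the local registry under the same name and tag.
mirror_image() {
  src=$1
  mirror=${2:-localhost:5000}
  ${DOCKER:-docker} pull "$src" &&
    ${DOCKER:-docker} tag "$src" "$mirror/$src" &&
    ${DOCKER:-docker} push "$mirror/$src"
}
```

After mirror_image debian:12, deployments would reference localhost:5000/debian:12, and upstream changes reach you only when you choose to mirror again.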
The second option is to stop depending on mutable tags. Instead, you can use image pinning, which refers to an image by the registry's sha256 digest of its manifest, a unique reference that cannot be changed. You can find this value in the RepoDigests when you inspect an image pulled from a registry server:
$ docker inspect -f '{{json .RepoDigests}}' debian:latest
["debian@sha256:de3eac83cd481c04c5d6c7344cd7327625a1d8b2540e82a8231b5675cef0ae5f"]
$ docker run -it --rm debian@sha256:de3eac83cd481c04c5d6c7344cd7327625a1d8b2540e82a8231b5675cef0ae5f /bin/bash
root@ac9db398dc03:/#
The biggest risk from binding to a specific image like this is missing security updates and important bug fixes. If you take this option, make sure to have a procedure to regularly update these images.
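One possible shape for such a procedure is a scheduled job that compares the pinned digest against what the tag currently resolves to. The sketch below is an assumption-laden illustration: the function names are hypothetical, DOCKER is a dry-run and testing hook that is not part of Docker, and it assumes the first RepoDigests entry is the one for the repository you pulled from.

```shell
# Print the digest reference a tag currently resolves to, after pulling it.
current_digest() {
  ${DOCKER:-docker} pull --quiet "$1" >/dev/null &&
    ${DOCKER:-docker} inspect -f '{{index .RepoDigests 0}}' "$1"
}

# Exit 0 when the pinned reference no longer matches the tag,
# 1 when it still matches, and 2 when the lookup itself failed.
pin_is_stale() {
  pinned=$1
  tag=$2
  latest=$(current_digest "$tag") || return 2
  [ "$pinned" != "$latest" ]
}
```

A job could run pin_is_stale with the pinned debian@sha256:... reference and debian:latest, then open a ticket or a pull request when it reports a newer digest, so that updates are reviewed rather than silently picked up.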
Regardless of which solution you follow for pulling images, using latest is only useful for a quick developer test, not for any production use case. The behavior of latest depends entirely on the repository maintainer: some always update it to the last release, some make it the last stable release, and some forget to update it at all. If you depend on latest, you'll likely experience an outage when upstream images change from a version like 1.5 to 2.0 with backwards-incompatible changes. Your next deploy will inadvertently include these changes unless you explicitly depend on a tag that offers the promise of bug fixes and security patches without breaking changes.
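A deploy pipeline can guard against this with a simple check on each image reference. The following is a sketch, not a complete reference parser: uses_latest is a hypothetical helper that flags references carrying the latest tag or no tag at all (which Docker treats as latest), and it accepts anything pinned by digest.

```shell
# Exit 0 when an image reference floats on "latest", 1 otherwise.
uses_latest() {
  ref=$1
  case $ref in
    *@sha256:*) return 1 ;;   # pinned by digest, cannot float
  esac
  name=${ref##*/}             # last path component, so a registry port
                              # such as localhost:5000 is not read as a tag
  case $name in
    *:latest) return 0 ;;
    *:*)      return 1 ;;     # some explicit tag other than latest
    *)        return 0 ;;     # no tag at all defaults to latest
  esac
}
```

For example, uses_latest debian and uses_latest localhost:5000/debian both succeed, while uses_latest debian:12 does not, so a pipeline could refuse to deploy whenever the check succeeds.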