DinD (Docker-in-Docker) does work in OpenShift 4 GitLab runners. I just got it working, and it was a fight! The catch is that the solution is extremely brittle to version changes elsewhere. For example, I tried swapping docker:20.10.16 for docker:latest or docker:stable, and that breaks it.
Here is the configuration in which it does work:
- OpenShift 4.12
- the Red Hat certified GitLab Runner Operator, installed via the OpenShift cluster web console / OperatorHub; it ships gitlab-runner v14.2.0
- docker:20.10.16 & docker:20.10.16-dind
Reference docs:
- GitLab Runner Operator installation guide: https://cloud.redhat.com/blog/installing-the-gitlab-runner-the-openshift-way
- Runner configuration details: https://docs.gitlab.com/runner/install/operator.html and https://docs.gitlab.com/runner/configuration/configuring_runner_operator.html
- and the key one, about matching pipeline and runner settings: https://docs.gitlab.com/ee/ci/docker/using_docker_build.html. This is the one to follow very precisely, both for the pipeline definition in .gitlab-ci.yml AND for the runner configuration in config.toml.
Installation steps:
1. Follow the first two reference docs above to install the GitLab Runner Operator in OpenShift, but do not instantiate a Runner from the operator yet.
2. On your GitLab server, copy the runner registration token for a group-wide or project-wide runner registration.
3. In a terminal session where the oc CLI is installed, log in to the OpenShift cluster with cluster-admin rights (for example as system:admin).
4. Create an OpenShift secret like this:
    vi gitlab-runner-secret.yml

    apiVersion: v1
    kind: Secret
    metadata:
      name: gitlab-runner-secret
      namespace: openshift-operators
    type: Opaque
    stringData:
      runner-registration-token: myRegistrationTokenHere

    oc apply -f gitlab-runner-secret.yml
5. Create a custom configuration map. Note that the operator merges the supplied content into the config.toml that it generates itself, so we only provide the fields we want to add (we cannot even override an existing field value). Note too that the executor is preset to "kubernetes" by the operator. For the details, see the reference docs above.
    vi gitlab-runner-config-map.toml

    [[runners]]
      [runners.kubernetes]
        host = ""
        tls_verify = false
        image = "alpine"
        privileged = true
        [[runners.kubernetes.volumes.empty_dir]]
          name = "docker-certs"
          mount_path = "/certs/client"
          medium = "Memory"

    oc create configmap gitlab-runner-config-map --from-file config.toml=gitlab-runner-config-map.toml
6. Create a Runner to be deployed by the operator (adjust the URL):
    vi gitlab-runner.yml

    apiVersion: apps.gitlab.com/v1beta2
    kind: Runner
    metadata:
      name: gitlab-runner
      namespace: openshift-operators
    spec:
      gitlabUrl: https://gitlab.example.com/
      buildImage: alpine
      token: gitlab-runner-secret
      tags: openshift, docker
      config: gitlab-runner-config-map

    oc apply -f gitlab-runner.yml
7. You should then see the newly created runner in the OpenShift console (Installed Operators > GitLab Runner > GitLab Runner tab), followed by the automatic creation of a pod (see Workloads). You can even open a terminal session on the pod and run, for instance, `gitlab-runner list` to see the location of the config.toml file. The runner should also be listed at the group or project level on the GitLab server. Of course, firewalls between your OpenShift cluster and your GitLab server may ruin your endeavors at this point...
8. The rest of the trick takes place in your .gitlab-ci.yml file; the extract below shows only one job at some stage. For the details, see the third reference doc above. The variable MY_ARTEFACT points to a sub-directory of the relevant git project/repo that contains a Dockerfile you have already built successfully, for instance in your IDE; REPO_PATH holds a common prefix string made of a Docker Hub repository path and some extra name piece. Adjust all that to your convenience, BUT don't edit any of the first 4 variables defined under this job and do not change the docker[dind] version; it would break everything.
    my_job_name:
      stage: my_stage_name
      tags:
        - openshift # to run on specific runner
        - docker
      image: docker:20.10.16
      variables:
        DOCKER_HOST: tcp://docker:2376
        DOCKER_TLS_CERTDIR: "/certs"
        DOCKER_TLS_VERIFY: 1
        DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
        REPO_TAG: ${REPO_PATH}-${MY_ARTEFACT}:${IMAGE_TAG}
      services:
        - docker:20.10.16-dind
      before_script:
        - sleep 10 && docker info # give the service time to start and confirm a good setup in the logs
        - echo $DOKER_HUB_PWD | docker login -u $DOKER_HUB_USER --password-stdin
      script:
        - docker build --network host -t $REPO_TAG ./$MY_ARTEFACT
        - docker push $REPO_TAG
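The fixed `sleep 10` in before_script is a guess at how long the dind service needs to start. As a possible alternative, here is a minimal sketch of a polling helper in plain POSIX sh. The function name `retry` and the attempt/delay values are my own assumptions, not something GitLab or Docker provides.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between two tries.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=0
  while [ "$n" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    n=$((n + 1))
    # Do not sleep after the last failed attempt.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Intended use once the function is available in the job's shell
# (assumed values: up to 30 tries, 2 seconds apart):
#   retry 30 2 docker info && docker info
```

The helper hides the output of the failed attempts, so the final plain `docker info` is still there to print the daemon details into the job log.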
There you are, trigger the gitlab pipeline...
If you misconfigured anything, you'll get the usual error message "is the docker daemon running?" after a complaint about failing to access "/var/run/docker.sock" or failing to connect to "tcp://localhost:2375". And no, port 2376 is not a typo: it is the exact value to use at step 8 above.
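To catch that kind of mismatch before the first docker command runs, a small sanity check can be added to the job. This is only a sketch: the function name `check_dind_env` is hypothetical, and the rule it encodes is the one from the GitLab "using_docker_build" page as I understand it (TLS enabled means port 2376 plus a client cert path; an empty DOCKER_TLS_CERTDIR means plain TCP on port 2375).

```shell
#!/bin/sh
# check_dind_env (hypothetical helper): verify that the Docker client
# variables of the job agree with each other.
# Assumed rule: DOCKER_TLS_CERTDIR set   -> DOCKER_HOST must end in :2376
#               DOCKER_TLS_CERTDIR empty -> DOCKER_HOST must end in :2375
check_dind_env() {
  if [ -n "${DOCKER_TLS_CERTDIR:-}" ]; then
    want_port=2376
    if [ -z "${DOCKER_CERT_PATH:-}" ]; then
      echo "DOCKER_CERT_PATH is empty although TLS is enabled"
      return 1
    fi
  else
    want_port=2375
  fi
  case "${DOCKER_HOST:-}" in
    tcp://*:"$want_port")
      echo "ok: DOCKER_HOST=$DOCKER_HOST"
      ;;
    *)
      echo "DOCKER_HOST=${DOCKER_HOST:-<unset>} but expected tcp://<service>:$want_port"
      return 1
      ;;
  esac
}
```

With the variables of the job shown above, the check passes; with DOCKER_HOST left at tcp://localhost:2375 it fails with a message that names the expected port.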
So far so good? ... not yet!
Security settings:
Well, you may now see your Docker builds starting (meaning DinD is OK), and then failing for security reasons (or hanging).
Although we set 'privileged = true' at step 5, that alone is not enough:
- Docker comes with a nasty yet convenient built-in default: it runs as 'root' in every container it builds, and for building containers.
- OpenShift, on the other hand, is built with strict security in mind and prevents any pod from running as root.
So we have to change the security settings to let those runners execute in privileged mode, which is why it is important to restrict these permissions to one namespace, here 'openshift-operators', and to the specific service account 'gitlab-runner-sa'.
`oc adm policy add-scc-to-user privileged -z gitlab-runner-sa -n openshift-operators`
The above creates a RoleBinding that you may remove or change as required. 'gitlab-runner-sa' is the service account the GitLab Runner Operator uses to instantiate runner pods; '-z' applies the permission to a service account (not a regular user account), and '-n' names the specific namespace we use here.
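Since privileged access is something you may want to hand out only for as long as needed, here is a small sketch that wraps the grant and its counterpart, `oc adm policy remove-scc-from-user`. The two function names are made up for illustration; the namespace and service account are the ones used in this answer.

```shell
#!/bin/sh
# Values used throughout this answer; adjust if your operator lives elsewhere.
NAMESPACE=openshift-operators
SERVICE_ACCOUNT=gitlab-runner-sa

# grant_privileged (hypothetical name): same command as shown above.
grant_privileged() {
  oc adm policy add-scc-to-user privileged -z "$SERVICE_ACCOUNT" -n "$NAMESPACE"
}

# revoke_privileged (hypothetical name): takes the privileged SCC away again.
revoke_privileged() {
  oc adm policy remove-scc-from-user privileged -z "$SERVICE_ACCOUNT" -n "$NAMESPACE"
}

# Usage, logged in with cluster-admin rights:
#   grant_privileged
#   revoke_privileged
```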
So you can now build images... but you may still be defeated when importing those images into an OpenShift project and trying to run the resulting pods. There are two constraints to anticipate:
- OpenShift will block any image that needs to run as 'root' (the default in docker run and docker compose up). SO, PLEASE ENSURE THAT ALL THE IMAGES YOU BUILD WITH DOCKER-in-DOCKER can run as a non-root user, using the Dockerfile directive USER <uid>:<gid>.
- ... but the above may not be sufficient! By default, OpenShift generates a random user ID to launch the container and ignores the one set at build time with USER <uid>:<gid>. To let the container actually run as the defined user, you have to bind the service account that runs your pods to the "anyuid" Security Context Constraint. This is easy to achieve via a role binding, or with the oc CLI:
oc adm policy add-scc-to-user anyuid -n myProjectName -z default
where -z denotes a service account in the namespace given by -n.
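To check whether the binding took effect, you can ask a freshly started pod which SCC admitted it and which user ID it really runs as. The sketch below assumes the pod carries the usual `openshift.io/scc` annotation and that its image ships the `id` command; the function names and the pod name in the usage comment are placeholders.

```shell
#!/bin/sh
# pod_scc POD NAMESPACE (hypothetical helper)
# Prints the SCC recorded in the pod's openshift.io/scc annotation.
pod_scc() {
  oc get pod "$1" -n "$2" -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
}

# pod_uid POD NAMESPACE (hypothetical helper)
# Prints the numeric user ID the pod's main container runs as.
pod_uid() {
  oc exec "$1" -n "$2" -- id -u
}

# Usage (my-pod is a placeholder):
#   pod_scc my-pod myProjectName
#   pod_uid my-pod myProjectName
```

Pods that were already running before the SCC change keep their old settings, so recreate the pod before checking.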