I'm running CI jobs with a gitlab-ci runner that is configured with the kubernetes executor and actually runs on OpenShift. I want to be able to build docker images from Dockerfiles, with the following constraints:

  1. The runner (an OpenShift pod) runs as a user with a high, random uid (234131111111 for example).
  2. The runner pod is not privileged.
  3. I have no cluster admin permissions and no ability to reconfigure the runner.

So obviously DinD cannot work, since it requires special docker device configuration. Podman, kaniko, buildah, buildkit and makisu don't work for a random non-root user without any volume. Any suggestions?

Bernard Hauzeur
Hadas
  • Docker was initially not in your question title but well in your tags and text; likely, you were so sure it would never work... so I edited the title for coherence and posted an answer re docker. I haven't yet practised other image builders... but that may be easier, because some like buildah are daemonless... and can digest native Dockerfile's – Bernard Hauzeur Feb 05 '23 at 09:34
  • For a good intro to many means to build compliant OCI images without docker and deploy them in kubernetes clusters, there's an eBOOK: https://developers.redhat.com/e-books/gitops-cookbook (registration required, free account at RedHat Developers is avail) – Bernard Hauzeur Feb 05 '23 at 09:38

1 Answer

DinD (Docker-in-Docker) does work in OpenShift 4 GitLab runners. I just got it running, and it was... a fight! The solution is extremely brittle to any version change elsewhere: swapping docker:20.10.16 for docker:latest or docker:stable, for example, breaks it.

Here is the configuration in which it works for me:

  1. OpenShift 4.12
  2. the Red Hat certified GitLab Runner Operator, installed via the OpenShift cluster web console / OperatorHub; it ships gitlab-runner v14.2.0
  3. docker:20.10.16 & docker:20.10.16-dind

Reference docs:

  1. GitLab Runner Operator installation guide: https://cloud.redhat.com/blog/installing-the-gitlab-runner-the-openshift-way
  2. Runner configuration details: https://docs.gitlab.com/runner/install/operator.html and https://docs.gitlab.com/runner/configuration/configuring_runner_operator.html
  3. and this key one about matching pipeline and runner settings: https://docs.gitlab.com/ee/ci/docker/using_docker_build.html which is the one to follow very precisely for both your .gitlab-ci.yml pipeline definitions AND the runner's config.toml file.

Installation steps:

  1. follow docs 1 and 2 in the references above to install the GitLab Runner Operator in OpenShift, but do not yet instantiate a Runner from the operator

  2. on your GitLab server, copy the runner registration token for a group-wide or project-wide runner registration

  3. elsewhere, in a terminal session where the oc CLI is installed, log in to the OpenShift cluster such that you have the cluster-admin role (for example as system:admin)

  4. create an OpenShift secret like:

    vi gitlab-runner-secret.yml

    apiVersion: v1
    kind: Secret
    metadata:
      name: gitlab-runner-secret
      namespace: openshift-operators
    type: Opaque
    stringData:
      runner-registration-token: myRegistrationTokenHere
    

    oc apply -f gitlab-runner-secret.yml

  5. create a custom configuration map. Note that the operator merges the supplied content into the config.toml it generates itself, so we only provide the fields we want to add (we cannot even override an existing field value). Note too that the executor is preset to "kubernetes" by the operator. For details, see the reference docs above.

    vi gitlab-runner-config-map.toml

        [[runners]]  
          [runners.kubernetes]
          host = ""
          tls_verify = false
          image = "alpine"
          privileged = true
          [[runners.kubernetes.volumes.empty_dir]]
            name = "docker-certs"
            mount_path = "/certs/client"
            medium = "Memory"
    

    oc create configmap gitlab-runner-config-map --from-file config.toml=gitlab-runner-config-map.toml

  6. create a Runner to be deployed by the operator (adjust the url)

    vi gitlab-runner.yml

        apiVersion: apps.gitlab.com/v1beta2
        kind: Runner
        metadata:
          name: gitlab-runner
          namespace: openshift-operators
        spec:
          gitlabUrl: https://gitlab.example.com/
          buildImage: alpine
          token: gitlab-runner-secret
          tags: openshift, docker
          config: gitlab-runner-config-map
    

    oc apply -f gitlab-runner.yml

  7. you should then see the runner just created in the OpenShift console (Installed Operators > GitLab Runner > GitLab Runner tab), followed by the automatic creation of a pod (see Workloads). You may even open a terminal session on the pod and type, for instance, gitlab-runner list to see the location of the config.toml file. You should also see the runner listed at the group or project level in the GitLab server console. Of course, firewalls between your OpenShift cluster and your GitLab server may ruin your endeavors at this point...

  8. the rest of the trick takes place in your .gitlab-ci.yml file, e.g. (extract showing only one job at some stage). For details, see reference doc 3 above. The variable MY_ARTEFACT points to a sub-directory of the relevant git project/repo that contains a Dockerfile you have already built successfully, in your IDE for instance; REPO_PATH holds a common prefix string including a Docker Hub repository path and some extra name piece. Adjust all that to your convenience, BUT don't edit any of the first 4 variables defined under this job and do not change the docker[dind] version; it would break everything.

        my_job_name:
          stage: my_stage_name
          tags:
            - openshift # to run on specific runner
            - docker
          image: docker:20.10.16
          variables:
            DOCKER_HOST: tcp://docker:2376
            DOCKER_TLS_CERTDIR: "/certs"
            DOCKER_TLS_VERIFY: 1
            DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
            REPO_TAG: ${REPO_PATH}-${MY_ARTEFACT}:${IMAGE_TAG}
          services:
            - docker:20.10.16-dind
          before_script:
            - sleep 10 && docker info #give time for starting the service and confirm good setup in logs
            - echo $DOKER_HUB_PWD | docker login -u $DOKER_HUB_USER --password-stdin
          script:
            - docker build --network host -t $REPO_TAG ./$MY_ARTEFACT
            - docker push $REPO_TAG
    

There you are, trigger the gitlab pipeline...
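If the pipeline stays pending or no runner shows up, the objects created in steps 4 to 6 can be listed from the CLI. This is only a sketch: check_runner_objects is a hypothetical helper name, it assumes a logged-in oc session, the object names used above, and that the config map was created in the same namespace as the Runner.

```shell
# Hypothetical helper: list the objects created in steps 4 to 6.
# Assumes the names used in this answer and a logged-in oc session.
check_runner_objects() {
  ns="${1:-openshift-operators}"   # namespace, defaults to the one used above
  oc get secret gitlab-runner-secret -n "$ns" &&
  oc get configmap gitlab-runner-config-map -n "$ns" &&
  oc get runners.apps.gitlab.com gitlab-runner -n "$ns" &&
  oc get pods -n "$ns"
}
```

Calling check_runner_objects with no argument checks the openshift-operators namespace; the chain stops at the first object that is missing.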

If you misconfigured anything, you'll get the usual error message "is the docker daemon running?" after a complaint about failing access to "/var/run/docker.sock" or a failing connection to "tcp://localhost:2375". And no, port 2376 is not a typo but the exact value to use at step 8 above.
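The fixed sleep 10 in before_script is a guess at how long the dind service needs to start. As a possible alternative, a small POSIX shell retry loop can poll until the daemon answers. wait_for is a hypothetical helper, not part of docker or GitLab, and I have not verified it in this exact runner setup.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the given number of attempts is used up.
wait_for() {
  attempts="$1"; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0          # command succeeded
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1              # still failing after all attempts
}
```

With this defined, `wait_for 30 docker info && docker info` would wait up to roughly 30 seconds for the daemon and then print its details in the job log.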

So far so good? ... not yet!

Security settings:

Well, you may now see your docker builds starting (meaning DinD is OK), and then failing for security reasons (or locking up), even though we set 'privileged = true' at step 5. Here is why:

Docker comes with a nasty yet easy (and built-in) feature: by default it runs as 'root' in every container it builds, and while building them. OpenShift, on the other hand, is built with strict security in mind and by default prevents any pod from running as root.

So we have to change security settings to let those runners execute in privileged mode. This is why it is important to restrict these permissions to one namespace, here 'openshift-operators', and to the specific service account 'gitlab-runner-sa'.

`oc adm policy add-scc-to-user privileged -z gitlab-runner-sa -n openshift-operators`

The above creates a RoleBinding that you may remove or change as required. 'gitlab-runner-sa' is the service account used by the GitLab Runner Operator to instantiate runner pods; '-z' targets the permission at a service account (not a regular user account), and '-n' names the specific namespace we use here.
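To check whether the binding took effect, one option is to read the openshift.io/scc annotation that OpenShift sets on an admitted pod. The sketch below assumes a logged-in oc session; pod_scc is a hypothetical helper, and the pod name has to be taken from the job pods you see while a pipeline runs.

```shell
# Hypothetical helper: print the SCC under which a pod was admitted.
# Pod name and namespace are placeholders to adjust.
pod_scc() {
  pod="$1"
  ns="${2:-openshift-operators}"
  oc get pod "$pod" -n "$ns" \
    -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
}
```

For a DinD job pod, I would expect this to print privileged once the binding is in place; any other value suggests the pod was admitted under a different SCC.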

So you can now build images... but you may still be defeated when importing those images into an OpenShift project and trying to run the resulting pods. There are two constraints to anticipate:

  1. OpenShift will block any image that needs to run as 'root' (the default user in docker run and docker compose up). ==> SO, PLEASE ENSURE THAT ALL THE IMAGES YOU BUILD WITH DOCKER-in-DOCKER can run as a non-root user, using the Dockerfile directive USER <uid>:<gid>!

  2. ... but the above may not be sufficient! Indeed, by default, OpenShift generates a random user ID to launch the container and ignores the one set in the Dockerfile as USER <uid>:<gid>. To let the container actually run as the defined user, you have to bind the service account that runs your pods to the "anyuid" Security Context Constraint. This is easy to achieve via a role binding, or with this oc CLI command:

    oc adm policy add-scc-to-user anyuid -n myProjectName -z default

    where -z denotes a service account in the namespace given by -n.
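Before pushing, it can help to check which user an image is configured to run as. The value can be read with `docker inspect --format '{{.Config.User}}' <image>`; the hypothetical helper below only classifies that string, treating an empty value as root because Docker defaults to uid 0 when no USER directive is set.

```shell
# Hypothetical helper: succeed (exit 0) when an image USER value means root.
# Accepts the forms "<user>" and "<user>:<group>"; empty means no USER was set.
is_root_user() {
  user="${1%%:*}"        # drop an optional ":<gid>" suffix
  case "$user" in
    ""|0|root) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a pipeline, something like `is_root_user "$(docker inspect --format '{{.Config.User}}' "$REPO_TAG")" && exit 1` could fail the job when the image would still run as root.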

Bernard Hauzeur
  • if you encounter hanging RUN docker build .... that can be solved by adding the --network host option as now illustrated in the sample .gitlab-ci.yml extract above – Bernard Hauzeur Mar 10 '23 at 10:35