I am using gitlab-runner version 14.4.0 and Docker version 20.10.11 on Ubuntu 18.04.6 LTS. The machine I am using for the runners is a powerful Supermicro server. Our GitLab CI is on GitLab cloud (SaaS).

I have been receiving the following errors on Build stage jobs:

  1. ERROR: Job failed (system failure): Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (exec.go:66:120s)
  2. Error: Job failed (system failure): Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:708:120s)
  3. Preparation failed: adding cache volume: set volume permissions: create permission container for volume "runner-######-project-#####-concurrent-0-cache-##############": Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (linux_set.go:90:120s)
  4. ERROR: Job failed (system failure): prepare environment: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? (docker.go:708:120s). Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information

The solutions I have tried so far:

  1. Added a multi-value pull policy, `pull_policy = ["always", "if-not-present"]`, in config.toml for all runners
  2. Added the gitlab-runner user to the docker and sudo groups
  3. Tried `chmod 666 /var/run/docker.sock`
  4. Ran `systemctl enable docker` and `systemctl start docker`
  5. Restarted gitlab-runner and reloaded the daemon
  6. Reinstalled the machine from scratch with Ubuntu 18.04.6 LTS and the latest Docker and gitlab-runner

None of these solved the issue. Restarting a failed job usually gets it running, but that is not a real solution.

I am new to this and any help is appreciated!

Thank you

sacrificulum
  • Can you please post your `config.toml` file that you're using for the GitLab runner? That will give us additional information about how you're configuring it and will let us help much more. – Patrick Nov 26 '21 at 19:45
  • concurrent = 70
    check_interval = 1
    [session_server]
      session_timeout = 1800
    [[runners]]
      name = "runnr"
      url = "https://gitlab.com/"
      token = "#####"
      executor = "docker"
      [runners.custom_build_dir]
      [runners.cache]
        [runners.cache.s3]
        [runners.cache.gcs]
        [runners.cache.azure]
      [runners.docker]
        tls_verify = false
        image = "Ubuntu:18.04"
        privileged = false
        pull_policy = ["always", "if-not-present"]
        disable_entrypoint_overwrite = false
        oom_kill_disable = false
        disable_cache = false
        volumes = ["/cache"]
        shm_size = 0
    – sacrificulum Nov 29 '21 at 08:17
  • Hello @Patrick, thanks for your response. Please see my config.toml above. – sacrificulum Nov 29 '21 at 08:19

1 Answer

The issue you're running into is that your job tries to use the Docker socket to build a container, but the socket is never exposed inside your executor. You have three options for solving this:

  1. Map the Docker socket into the job container. To do this, where you're specifying volumes in config.toml, add /var/run/docker.sock:/var/run/docker.sock to the array of mapped volumes.
  2. Use docker-in-docker (DIND) with a privileged container. This doesn't require you to map the Docker socket, but it does require you to be familiar with how DIND works and to follow the instructions here: https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#use-the-docker-executor-with-the-docker-image-docker-in-docker
  3. Build your container with a tool that doesn't require a Docker socket. I'd highly recommend Kaniko: it tends to be faster than Docker, and you can avoid the Docker socket altogether, which also makes your builds more secure: https://docs.gitlab.com/ee/ci/docker/using_kaniko.html#building-a-docker-image-with-kaniko
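
For option 1, the change could look roughly like the fragment below. This is a minimal sketch based on the config.toml posted in the comments, not a complete file: only the volumes line differs, and all other settings are assumed to stay as they are.

```toml
[[runners]]
  executor = "docker"
  [runners.docker]
    privileged = false
    # Bind-mount the host's Docker socket into every job container,
    # alongside the existing cache volume.
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
```

With this mapping, jobs talk to the host's Docker daemon directly, so the job image still needs a Docker client installed, and anything a job starts runs on the host alongside other jobs' containers.
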
Patrick
  • Thank you so much for your answer. I have tried solution 1 and also gave the gitlab-runner user ownership of docker.sock using this command: `chown gitlab-runner:docker /var/run/docker.sock`. I am still facing the same errors. Solutions 2 and 3 are not ideal for me because of my slow internet. Is there any other solution that I can try in my current environment? – sacrificulum Dec 02 '21 at 10:43
  • Using chown on the socket likely won't do anything because of how the Docker daemon itself runs. If the socket were exposed properly, you would get a different error (permission denied), so it's still not exposed properly. Your internet connection won't have any impact on options 2 or 3: the only thing pulled from the internet would be the container image itself, which you'll have to pull even if you're exposing the socket. So if you're struggling with the socket, I'd suggest those. Option 3 is by far the best practice anyway. – Patrick Dec 02 '21 at 23:23