I have a GitLab Runner (version 14.4.0) running in an Ubuntu VM. The Docker version is 20.10.10. Everything was working as expected.

I then wanted to delete the installed images in the folder "/var/lib/docker/vfs", so I ran the following steps:

systemctl stop docker

cd /usr/share/gitlab-runner
./clear-docker-cache prune

docker system prune -f --all

ls -la /var/lib/docker/vfs/dir/
# returns an empty dir which is what I want

systemctl daemon-reload
systemctl start docker

systemctl stop gitlab-runner
systemctl start gitlab-runner

After that, I tried to start a new build job using this runner. Unfortunately, the GitLab Runner continues to reference the images I've deleted.

The following error occurs when I try to build something with the runner:

Using Docker executor with image my-alpine:0.1.6 ...
ERROR: Preparation failed: adding cache volume: set volume permissions: create permission container for volume "runner-o19hepv1-project-133520-concurrent-0-cache-3c3f060a0374fc8bc39395164f415a70": Error response from daemon: 48ac0f992674b920004317b8b6fc91dbc72f01327ca96005f7b19693f3c128ca: stat /var/lib/docker/vfs/dir/48ac0f992674b920004317b8b6fc91dbc72f01327ca96005f7b19693f3c128ca: no such file or directory (linux_set.go:95:0s)

How do I get rid of these error messages? What did I do wrong in my approach? In principle, I would also like the images to be deleted automatically once a week from now on.
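For the weekly deletion mentioned above, one possible shape is a small script triggered by cron. This is only a sketch: the function name `weekly_docker_prune` and the `--run` switch are hypothetical, and it reuses the same `docker system prune -f --all` command from the steps above. It checks first that the Docker daemon answers, because the docker CLI cannot prune anything while the daemon is stopped.

```shell
#!/bin/sh
# Sketch of a weekly image cleanup (names and the --run switch are assumptions).

# Prune unused images, but only if the Docker daemon is reachable;
# the docker CLI needs a running daemon to do any work.
weekly_docker_prune() {
    if ! docker info >/dev/null 2>&1; then
        echo "docker daemon not reachable, skipping prune" >&2
        return 1
    fi
    # Same command as in the manual steps: removes all unused images,
    # stopped containers, unused networks and build cache without prompting.
    docker system prune -f --all
}

# Only act when explicitly asked to, so the file can also be sourced safely.
if [ "${1:-}" = "--run" ]; then
    weekly_docker_prune
fi
```

Assuming the script were saved at a hypothetical path such as /usr/local/bin/docker-weekly-prune.sh, a root crontab line like `0 3 * * 0 /usr/local/bin/docker-weekly-prune.sh --run` would run it every Sunday at 03:00.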

The gitlab-runner systemd service is started with

/usr/bin/gitlab-runner "run" "--working-directory" "/home/gitlab-runner" "--config" "/etc/gitlab-runner/config.toml" "--service" "gitlab-runner" "--user" "gitlab-runner"

and the configuration (config.toml) is

concurrent = 5
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "my-gitlabrunner"
  url = "https://git.tech.rz.db.de/"
  token = "mytoken"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "alpine"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
user5580578

1 Answer


I had a similar problem. In my case, it was caused by the runner trying to create a "permissions container" from a faulty image. Deleting that image so that it would be re-downloaded fixed it for me. The image was called registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-8925d9a0

$ docker image rm registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-8925d9a0
Error response from daemon: exit status 1: "/usr/bin/zfs fs destroy -r system/docker/418e78d27d51c2e2628534aaf9f84c5d76748d62e548a4de356328e0fb3a0c31" => cannot open 'system/docker/418e78d27d51c2e2628534aaf9f84c5d76748d62e548a4de356328e0fb3a0c31': dataset does not exist

Despite the error message, the image was deleted. When I then retried a CI job, it was downloaded again, and everything has worked fine since.
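If the exact helper tag on your machine is different, the same fix can be sketched as a small script that looks up whatever gitlab-runner-helper images are stored locally and removes each one. The function names and the `--run` switch are hypothetical; only `docker image ls` and `docker image rm` are real commands. A removal that reports an error is logged as a warning rather than treated as fatal, since in my case the image was gone despite the error.

```shell
#!/bin/sh
# Sketch: remove locally stored gitlab-runner-helper images so the runner
# pulls a fresh copy on the next job. Function names are hypothetical.

HELPER_REPO="registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper"

# Print every local helper image as repository:tag, one per line.
list_helper_images() {
    docker image ls --format '{{.Repository}}:{{.Tag}}' "$HELPER_REPO"
}

# Remove each helper image; continue even if a removal reports an error.
remove_helper_images() {
    list_helper_images | while IFS= read -r image; do
        [ -n "$image" ] || continue
        docker image rm "$image" ||
            echo "warning: removing $image reported an error" >&2
    done
}

# Only act when explicitly asked to, e.g.: sh remove-helper-images.sh --run
if [ "${1:-}" = "--run" ]; then
    remove_helper_images
fi
```

After the images are removed, retrying a CI job should make the runner download the helper image again.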

starfry