I'm running a service on a Swarm cluster, deployed with docker stack deploy --with-registry-auth and this compose file:

version: "3.1"
services:
  builder-consumer:
    image: us.gcr.io/my-gcloud-project/my/image:123
    stop_grace_period: 30m
    volumes:
      - [...]
    environment:
      - [...]
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == worker
    secrets:
      - [...]
secrets:
  [...]

This works fine when I deploy, but when I add a worker node to the swarm later on, the new worker can't pull the image required to run the task. The system logs report this:

level=error msg="Not continuing with pull after error: denied: Permission denied for \"123\" from request \"/v2/my-gcloud-project/my/image/manifests/123\". "

level=info msg="Translating \"denied: Permission denied for \\"123\\" from request \\"/v2/my-gcloud-project/my/image/manifests/123\\". \" to \"repository us.gcr.io/my-gcloud-project/my/image not found: does not exist or no pull access\""

level=error msg="pulling image failed" error="repository us.gcr.io/my-gcloud-project/my/image not found: does not exist or no pull access" module="node/agent/taskmanager" node.id=... service.id=... task.id=...

level=error msg="fatal task error" error="No such image: us.gcr.io/my-gcloud-project/my/image:123@sha256:..." module="node/agent/taskmanager" node.id=... service.id=... task.id=...

However, when I manually run docker pull on that machine, it works fine, since every machine in the cluster is authenticated to my private Google Registry, thanks to docker login.

Thus my questions are:

  • Why can't the added worker pull from the private registry?
  • What does --with-registry-auth do exactly?

Thanks a lot

Note: the nodes are running Ubuntu 16.04.2 LTS and the Docker version is:

Server:
 Version:      17.04.0-ce
 API version:  1.28 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   4845c56
 Built:        Mon Apr  3 18:07:42 2017
 OS/Arch:      linux/amd64
 Experimental: false
Faizan
BOUGA
  • It looks like you've encountered a swarm mode issue. Have you checked their issues on GitHub and/or raised your own issue? – BMitch Sep 06 '17 at 00:39
  • @bouga - did you solve this somehow? I have the same problem and looking for some workaround. – Izydorr Feb 06 '19 at 14:21
  • @Izydorr Nope, we simply moved away from swarm and use Kubernetes instead – BOUGA Feb 07 '19 at 21:52

1 Answer

In my case I was not deploying the stack with "--with-registry-auth". I shut down the instances, redeployed from the manager with that option, and now it works.
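A minimal sketch of that redeploy, assuming it is run on a manager node that has already authenticated with docker login. The function name, compose file name and stack name are hypothetical placeholders, not taken from the question. According to the Docker CLI help, --with-registry-auth sends registry authentication details to the Swarm agents.

```shell
#!/bin/sh
# Redeploy a stack so the manager forwards its registry credentials
# to the Swarm agents along with the service definition.
# Arguments (both hypothetical placeholders):
#   $1  path to the compose file
#   $2  name of the stack
redeploy_with_auth() {
    compose_file="$1"
    stack_name="$2"
    docker stack deploy --with-registry-auth \
        --compose-file "$compose_file" "$stack_name"
}

# Example usage, on a manager node already logged in to the registry:
#   redeploy_with_auth docker-compose.yml mystack
```

This only covers the case where the flag was missing at deploy time; it is not confirmed to fix the case where a worker joins the swarm later.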

Cris R
  • As I said, I also deploy with `--with-registry-auth` and it works at first, but when I add a VM to the swarm (by joining as a worker), it fails to pull the services images due to a "Permission denied". – BOUGA Jun 01 '17 at 08:50
  • What about the TTL for `CI_JOB_TOKEN`? – Richard Apr 03 '18 at 02:35