
I have a small working Docker swarm on Google Cloud Platform. There are just two nodes: one running nginx and PHP, the other running MySQL.

Right now it seems that, from the manager node, I can't connect to MySQL on the worker node:

SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo failed: Name or service not known

The same problem occurs with ping from a shell inside the container.
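For reference, the check looks roughly like this (the container ID is a placeholder; `mysql` is the service name from my stack, which should resolve via the overlay network's built-in DNS):

```shell
# Find the app container running on the manager node
docker ps --filter name=test_app -q

# Open a shell inside it (substitute the real container ID)
docker exec -it <container-id> sh

# Inside the container: try to resolve and reach the mysql service
getent hosts mysql   # should print the service VIP if DNS works
ping -c 2 mysql      # fails with "Name or service not known"
```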

I used the --advertise-addr flag when initializing the swarm:

docker swarm init --advertise-addr 10.156.0.3

Then I successfully joined the swarm from the second node:

docker swarm join --token my-token 10.156.0.3:2377
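As far as I know, swarm overlay networking needs a few ports open between the nodes (TCP 2377 for management, TCP/UDP 7946 for node communication, UDP 4789 for VXLAN traffic). A rough TCP-level sanity check from one node toward the other (exact tooling may vary):

```shell
# From each node toward the other node's internal IP:
nc -zv <other-node-ip> 2377   # cluster management (toward the manager)
nc -zv <other-node-ip> 7946   # node communication (TCP; UDP 7946 is also used)
# UDP 4789 (VXLAN overlay traffic) can't be probed with a plain TCP check
```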

The deploy also succeeds:

docker stack deploy --compose-file docker-compose.yml test

Creating network test_default
Creating service test_mysql
Creating service test_web
Creating service test_app

(There is no network definition in docker-compose.yml; I'm using the Docker default.)
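For clarity, my understanding is that relying on the default is equivalent to declaring the stack's overlay network explicitly, something like this (names are illustrative, not from my actual file):

```yaml
# Roughly what `docker stack deploy` creates by default:
# a stack-scoped overlay network that every service joins.
networks:
  default:
    driver: overlay
services:
  mysql:
    image: mysql:5.7
    networks:
      - default
```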

Nodes:

ID                            HOSTNAME       STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
oz1ebgrp1a68brxi0nd1gdr2k     mysql-001      Ready               Active                                  18.03.1-ce
ndy11zyxi0wym8mjmgh8op1ni *   app-001        Ready               Active              Leader              18.03.1-ce

docker stack ps test

ID                  NAME                          IMAGE                                 NODE        DESIRED STATE       CURRENT STATE           ERROR      PORTS 
9afwjgtpy8lc        test_app.1                  127.0.0.1:5000/app:latest             app-001       Running             Running 8 minutes ago     
mgajupmcai0t        test_web.1                  127.0.0.1:5000/web:latest             app-001       Running             Running 8 minutes ago           
s17jvkukahl7        test_mysql.1                mysql:5.7                             mysql-001     Running             Running 8 minutes ago     

docker network ls:

NETWORK ID          NAME                DRIVER              SCOPE
9084b39892f4        bridge              bridge              local
ofqtewx039fl        test_default        overlay             swarm
5cc9d4554bea        docker_gwbridge     bridge              local
97fbd06a23b5        host                host                local
x8f408klk2ms        ingress             overlay             swarm
ca1b849ea73a        none                null                local
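To see which tasks and nodes are actually attached to the stack's overlay network, I can inspect it (network name taken from the listing above):

```shell
# The Containers and Peers sections show which tasks and nodes
# are attached to the overlay network on this host
docker network inspect test_default
```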

Here is my docker info:

Containers: 12
 Running: 3
 Paused: 0
 Stopped: 9
Images: 35
Server Version: 18.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: ndy11zyxi0wym8mjmgh8op1ni
 Is Manager: true
 ClusterID: q23l1v6dav3u4anqqu51nwx0r
 Managers: 1
 Nodes: 2
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 10.156.0.3
 Manager Addresses:
  10.156.0.3:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.13.0-1019-gcp
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 14.09GiB
Name: app-001
ID: IWKK:NWRJ:HKAQ:3JSQ:7H3L:2WXC:IIJ7:OEKB:4ARR:T7FY:VAWR:HOPL
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

This swarm was working fine a few weeks ago. I didn't need the application for a while, so I turned off all the machines. Meanwhile swarm-node.crt expired, so today, when I turned the machines back on, I had to remove the service and the swarm and recreate everything from scratch. The result is that I can't connect from a container on one node to a container on the other node.

Any help will be appreciated.

UPDATE:

Here is docker-compose.yml:

version: '3'
services:
  web:
    image: 127.0.0.1:5000/web
    build:
      context: ./web
    volumes:
      - ./test:/var/www
    ports:
      - 80:80
    links:
      - app
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == app-001
  app:
    image: 127.0.0.1:5000/app
    build:
      context: ./app
    volumes:
      - ./test:/var/www
    depends_on:
      - mysql
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == app-001
  mysql:
    image: mysql:5.7
    volumes:
      - /mnt/disks/ssd-001/mysql:/var/lib/mysql
      - /mnt/disks/buckets/common-storage-001/backup/mysql:/backup
    environment:
      - "MYSQL_DATABASE=test"
      - "MYSQL_USER=test"
      - "MYSQL_PASSWORD=*****"
      - "MYSQL_ROOT_PASSWORD=*****"
    command: mysqld --key-buffer-size=32M --max-allowed-packet=16M --myisam-recover-options=FORCE,BACKUP --tmp-table-size=32M --query-cache-type=0 --query-cache-size=0 --max-heap-table-size=32M --max-connections=500 --thread-cache-size=50 --innodb-flush-method=O_DIRECT --innodb-log-file-size=512M --innodb-buffer-pool-size=16G --open-files-limit=65535
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.hostname == mysql-001
