8

i am trying to connect my docker services together in docker swarm.

the network is made of 2 raspberry pi's.

i can create an overlay network called test-overlay and i can see that services on either raspberry pi node can connect to the network.

my problem:

i cannot link to services between nodes with the overlay network.

given the following configuration of nodes and services, service1 can use the address http://service2 to connect to service2. but it does NOT work for http://service3. however http://service3 is accessible from service4.

node1:
  - service1
  - service2
node2:
  - service3
  - service4

i am new to docker swarm and any help is appreciated.

inspecting overlay

i have run the command sudo docker network inspect test-overlay on both nodes.

on the master node this returns the following:

[
    {
        "Name": "test-overlay",
        "Id": "skxhz8sb3f82dhh9jt9t3j5yl",
        "Created": "2018-04-15T20:31:20.629719732Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "3acb436a0cc9a4d584d537edb1546988d334afa4793cc4fae4dd6ac9b48828ea": {
                "Name": "docker-registry.1.la1myuodpkq0x5h39pqo6lt7f",
                "EndpointID": "66887fb1f5f253c6cbec149aa51ab85168903fdd2290719f26d2bcd8d6c68dc8",
                "MacAddress": "02:42:0a:00:00:04",
                "IPv4Address": "10.0.0.4/24",
                "IPv6Address": ""
            },
            "786e1fee538f81fe41ccd082800c646a0e191b0fd912e5c15530e61c248e81ac": {
                "Name": "portainer.1.qyvvlcdqo5sewuku3eiykaplz",
                "EndpointID": "0d29e5452c208ed637ae2e7dcec026f39d2431e8e0e20765a9e0e6d6dfdc60ca",
                "MacAddress": "02:42:0a:00:00:15",
                "IPv4Address": "10.0.0.21/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4101"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "d049fc8f8ae1",
                "IP": "192.168.1.2"
            },
            {
                "Name": "6c0da128f308",
                "IP": "192.168.1.3"
            }
        ]
    }
]

on the worker node this returns the following:

[
    {
        "Name": "test-overlay",
        "Id": "skxhz8sb3f82dhh9jt9t3j5yl",
        "Created": "2018-04-20T14:04:57.870696195Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "4cb50161119e4b58a472e1b5c380c301bbb00a23fc99fc2e0712a8c4bde6d9d4": {
                "Name": "minio.1.fo2su2quv8herbmnxqfi3g8w2",
                "EndpointID": "3e85786304ed08f02c09b8e1ed6a153a3b4c2ef7afe503a1b0ca6cf341521645",
                "MacAddress": "02:42:0a:00:00:d6",
                "IPv4Address": "10.0.0.214/24",
                "IPv6Address": ""
            },
            "ce99b3788a4f9438e276e0f52a8f4d29fa09179e3e93b31b14f45339ce3c5315": {
                "Name": "load-balancer.1.j64h1eecsc05b7d397ejvedv3",
                "EndpointID": "3b7e73d27fe30151f2dc2a0ba8a5afc7f74fd283159a03a592be10e297f58d51",
                "MacAddress": "02:42:0a:00:00:d0",
                "IPv4Address": "10.0.0.208/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4101"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "d049fc8f8ae1",
                "IP": "192.168.1.2"
            },
            {
                "Name": "6c0da128f308",
                "IP": "192.168.1.3"
            }
        ]
    }
]
X0r0N
  • can you verify that your node2 is a worker of your node1? `docker node ls` – Trevor V Apr 13 '18 at 03:03
  • Are the two nodes in the same swarm? – Constantin Galbenu Apr 13 '18 at 12:57
  • node2 is definitely a worker of node1 and in the same swarm. i can confirm this because i can deploy services from the master to the worker with placement constraint `node.role == worker`. `sudo docker node ls` returns a table with both nodes as expected. – X0r0N Apr 13 '18 at 17:01
  • What is `docker network inspect test-overlay` showing on both nodes? Especially in `Peers` and `Containers`. – Izydorr Apr 20 '18 at 08:44
  • i have updated the question with the debug info requested. both `Peers` are listed on both nodes, but the `Containers` section only displays containers running on that particular node. – X0r0N Apr 20 '18 at 14:19
  • @X0r0N - did you solve the issue? I recently had a similar one. It was working pretty well, but after some time the overlay network got "split" and containers on the manager node couldn't communicate with the worker node. Surprisingly, rebooting the worker node helped, but this shouldn't have happened. Did you do any more research on this? – Miq May 17 '18 at 10:00
  • @Miq unfortunately i did not get it to work at all. unlike your solution, my swarm nodes have never been able to communicate to each other through the overlay network. i fixed my issues by making an entry to the `hosts` file which Portainer allows you to configure directly in the UI (so no volume mounting). it is not a good solution, but it will have to do. – X0r0N May 17 '18 at 10:11
  • What's the output from `docker network inspect test-overlay -v`? The `-v` will give you networking information on each of the services.. there have been times where a service is unreachable on 1 node, but OK on all others. This command _should_ help debug that. – Ryan Smith Jun 27 '18 at 18:24

3 Answers

9

it seems this problem was caused by the nodes not being able to connect to each other on the required ports:

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic

before you open those ports, a better and simpler solution is to use the docker image portainer/agent. like the documentation says,

The Portainer Agent is a workaround for a Docker API limitation when using the Docker API to manage a Docker environment.

https://portainer.readthedocs.io/en/stable/agent.html

i hope this helps anyone else experiencing this problem.

X0r0N
  • Thanks a lot, X0r0N. That was the hint I needed to resolve my issue. For the record, I stumbled upon this tutorial mentioning the same ports and how to add them to e.g. FirewallD: https://www.digitalocean.com/community/tutorials/how-to-configure-the-linux-firewall-for-docker-swarm-on-centos-7 – natterstefan Dec 17 '18 at 12:16
  • Sorry, can't understand, advertised addresses and opened ports, deployed using https://portainer.readthedocs.io/en/stable/deployment.html#quick-start, and in portainer network still cant see peers, but can access peers from the UI.. – Alex Dembo Dec 13 '19 at 12:44
  • Ok, a reboot helped. Spent on this 6h. Angry af. – Alex Dembo Dec 13 '19 at 14:51
  • This didn't work for me, so I disabled firewall `ufw disable` – Chenna May 19 '20 at 16:39
  • i assume you can also use `ufw allow` to allow the ports (mentioned above) explicitly instead of disabling the whole firewall. – X0r0N May 25 '20 at 10:34
  • Portainer agent isn't useful if you haven't opened the ports for the overlay ingress network that is used by the agent. You will not be able to get the service1 to connect to service2 just by adding the agent to the swarm. – Kavinda Gayashan Sep 09 '20 at 09:52
2

I am not able to leave a comment yet, but i managed to solve this issue with the solution provided by X0r0N, and i am leaving this comment to help people in my position to find a solution in the future.

I was deploying 10 Droplets in DigitalOcean, with the default Docker image provided by Docker. Its description says that it closes all ports except those related to Docker. This clearly does not cover Swarm use cases.

After allowing ports 2377, 4789 and 7946 in ufw, the Docker Swarm is now working as expected.

To make this answer stand on its own, the ports map to the following functionality:

TCP port 2377: Cluster Management Communication
TCP and UDP port 7946: Communication between nodes
UDP port 4789: Overlay Network Traffic
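
For anyone wanting to reproduce the ufw change described in this answer, here is a minimal sketch. It only prints the rules so they can be reviewed first. The function name `swarm_ufw_rules` is made up for illustration; the `ufw allow PORT/PROTO` form is standard ufw syntax, and the port and protocol pairs are the ones from the Docker documentation quoted in this thread.

```shell
#!/bin/sh
# Sketch: print the ufw rules for the Swarm ports named in this answer.
# Nothing is changed on the system until the output is piped to a root shell.
# swarm_ufw_rules is a hypothetical helper name, not part of ufw or Docker.
swarm_ufw_rules() {
    echo "ufw allow 2377/tcp"   # cluster management communications
    echo "ufw allow 7946/tcp"   # communication among nodes
    echo "ufw allow 7946/udp"   # communication among nodes
    echo "ufw allow 4789/udp"   # overlay network traffic
}

# Dry run: show the rules without applying them.
swarm_ufw_rules
```

Once the output looks right, something like `swarm_ufw_rules | sudo sh` would apply it, and it would need to be run on every node in the swarm. Restricting the rules to the other nodes' addresses is probably safer than opening the ports to everyone, but that depends on your setup.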

0

Check if your nodes have the ports the swarm needs to operate opened properly as described here https://docs.docker.com/network/overlay/ in "Prerequisites":

TCP port 2377 for cluster management communications
TCP and UDP port 7946 for communication among nodes
UDP port 4789 for overlay network traffic
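
A rough way to check the TCP ports from one node to the other is sketched below. It assumes `bash` and the coreutils `timeout` command are installed; `tcp_port_open` is a hypothetical helper name, and 192.168.1.3 is just one of the `Peers` addresses from the question (pass the other node's address as the first argument). It cannot verify the UDP ports (7946/udp and 4789/udp), since a UDP "connect" succeeds whether or not anything is listening.

```shell
#!/bin/sh
# Sketch: report whether a TCP port on a peer node accepts connections.
# Uses bash's /dev/tcp pseudo-device; timeout stops a filtered port from hanging.
# tcp_port_open is a hypothetical helper name.
tcp_port_open() {
    host=$1
    port=$2
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Address of the other node; defaults to one of the Peers from the question.
peer=${1:-192.168.1.3}

# Note: 2377 is only expected to be listening on a manager node.
for port in 2377 7946; do
    if tcp_port_open "$peer" "$port"; then
        echo "tcp/$port on $peer: reachable"
    else
        echo "tcp/$port on $peer: blocked or closed"
    fi
done
```

If 7946 shows as blocked in either direction, the nodes probably cannot exchange the gossip traffic the overlay network depends on, which would be consistent with the symptoms in the question.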
Izydorr
  • thanks for your reply. i have already seen that page in the docs and opened up those ports with UFW. this doesn't fix the issue. – X0r0N Apr 19 '18 at 15:21
  • this indeed seems to be a problem. my solution was to use the docker image portainer/agent. – X0r0N Jun 28 '18 at 17:37