I have a swarm cluster in which I created a global service to run on all docker hosts in the cluster.
The goal is to have each container instance for this service connect to a port listening on the docker host.
For context, I am following the Docker Daemon Metrics guide, which exposes the new Docker metrics API on every host and then proxies that host port into the overlay network so that Prometheus can scrape metrics from all swarm hosts.
I have read several Docker GitHub issues (#8395, #32101, #32277, #1143), and my understanding from them matches what the Docker Daemon Metrics guide describes: to connect to the host from within a swarm container, I should use the gateway address of the docker_gwbridge network, which by default is 172.18.0.1.
Every container in my swarm has a network interface on the docker_gwbridge network (eth1 below):
326: eth0@if327: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 02:42:0a:ff:00:06 brd ff:ff:ff:ff:ff:ff
inet 10.255.0.6/16 scope global eth0
valid_lft forever preferred_lft forever
inet 10.255.0.5/32 scope global eth0
valid_lft forever preferred_lft forever
333: eth1@if334: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether 02:42:ac:12:00:04 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.4/16 scope global eth1
valid_lft forever preferred_lft forever
Also, every container in the swarm has a default route via 172.18.0.1:
/prometheus # ip route show 0.0.0.0/0 | grep -Eo 'via \S+' | awk '{ print $2 }'
172.18.0.1
/prometheus # netstat -nr | grep '^0\.0\.0\.0' | awk '{print $2}'
172.18.0.1
/prometheus # ip route
default via 172.18.0.1 dev eth1
10.0.1.0/24 dev eth2 src 10.0.1.9
10.255.0.0/16 dev eth0 src 10.255.0.6
172.18.0.0/16 dev eth1 src 172.18.0.4
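For reference, the gateway lookup above can be wrapped in a small helper so the address is not hard-coded. This is only a sketch: `default_gateway` is a hypothetical function name, and it assumes the `ip route` output format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `ip route` output on stdin and print the
# gateway address of the first default route ("default via <addr> ...").
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Usage inside a container (4999 is the metrics port used in this post):
#   gw=$(ip route | default_gateway)
#   wget -O- "$gw:4999"
```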
Despite this, I cannot communicate with 172.18.0.1 from within the container:
/ # wget -O- 172.18.0.1:4999
Connecting to 172.18.0.1:4999 (172.18.0.1:4999)
wget: can't connect to remote host (172.18.0.1): No route to host
On the host, I can access the docker metrics API on 172.18.0.1. I can ping and I can make a successful HTTP request.
- Can anyone shed some light as to why this does not work from within the container as outlined in the Docker Daemon Metrics guide?
- If the container has a network interface on the 172.18.0.0/16 network and a default route via 172.18.0.1, why do pings to 172.18.0.1 fail from within the container?
- If this is not a valid approach for accessing a host port from within a swarm container, then how would one go about achieving this?
EDIT: Just realized that I did not give all the information in the original post. I am running docker swarm on a CentOS 7.2 host with docker version 17.04.0-ce, build 4845c56. My kernel is a build of 4.9.11 with vxlan and ipvs modules enabled.
After some further digging, this appears to be a firewall issue. Not only was I unable to ping 172.18.0.1 from within the containers, I could not ping my host machine at all. I tried my domain name, the server's FQDN and even its public IP address, but the container could not ping the host (the container does have network access, since it can ping external hosts such as Google).
I disabled firewalld on my host and then restarted the docker daemon. After this I was able to ping my host from within the containers (by both domain name and 172.18.0.1). Unfortunately, this is not a solution for me. I need to identify which firewall rules to put in place to allow container->host communication without disabling firewalld.
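To make the question concrete, this is the kind of rule I have in mind. It is an untested sketch, not a confirmed fix: the function names are hypothetical, the zone `public` is an assumption (the active zones can be listed with `firewall-cmd --get-active-zones`), and the subnet and port are the ones from this post.

```shell
#!/bin/sh
# Hypothetical helper: build a firewalld rich rule that accepts TCP
# traffic from a source subnet to a single port.
gwbridge_rich_rule() {
    printf 'rule family="ipv4" source address="%s" port port="%s" protocol="tcp" accept' "$1" "$2"
}

# Hypothetical helper: add the rule permanently to a zone, then reload
# firewalld. Must be run as root on the docker host.
#   $1 = zone, $2 = source subnet, $3 = port
allow_gwbridge_port() {
    firewall-cmd --permanent --zone="$1" \
        --add-rich-rule="$(gwbridge_rich_rule "$2" "$3")" &&
        firewall-cmd --reload
}

# Example (assumed zone "public"; subnet and port taken from this post):
#   allow_gwbridge_port public 172.18.0.0/16 4999
```

Reloading firewalld may drop the iptables chains that Docker installed, so the docker daemon may need a restart afterwards, as it did when I disabled firewalld.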