I'm currently struggling to get a small Traefik example running on my Docker Swarm instance.
I started with a docker-compose file for local development, and everything works as expected. But when I define the same setup as a swarm stack file to bring that environment into production, I always get a Bad Gateway from Traefik.
After a lot of searching, it seems to be related to a networking issue in which Traefik tries to send requests between two different networks, but I'm not able to find the cause.
After several iterations I tried to reproduce the issue with "official" containers to provide a better example for other people.
This is my traefik.yml:
version: "3.7"
networks:
  external:
    external: true
services:
  traefik:
    image: "traefik:v2.8.1"
    command:
      - "--log.level=INFO"
      - "--accesslog=true"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.swarmMode=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.docker.network=external"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.forwardedHeaders.insecure"
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - external
    deploy:
      placement:
        constraints: [node.role == manager]
  host-app:
    image: traefik/whoami
    ports:
      - "9000:80"
    networks:
      - external
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.host-app.rule=PathPrefix(`/whoami`)"
        - "traefik.http.services.host-app.loadbalancer.server.port=9000"
        - "traefik.http.routers.host-app.entrypoints=web"
        - "traefik.http.middlewares.host-app-stripprefix.stripprefix.prefixes=/"
        - "traefik.http.routers.host-app.middlewares=host-app-stripprefix@docker"
        - "traefik.docker.network=external"
The network is created with: docker network create -d overlay external
and I deploy the stack with: docker stack deploy -c traefik.yml server
Up to this point there are no issues and everything spins up fine.
When I curl localhost:9000 I get the correct response:
curl localhost:9000
Hostname: 7aa77bc62b44
IP: 127.0.0.1
IP: 10.0.0.8
IP: 172.25.0.4
IP: 10.0.4.6
RemoteAddr: 10.0.0.2:35068
GET / HTTP/1.1
Host: localhost:9000
User-Agent: curl/7.68.0
Accept: */*
But when I request the route through Traefik:
curl localhost/whoami
Bad Gateway%
I always get the Bad Gateway error.
So I checked the network with docker network inspect external
to make sure that both services are attached to the same network, and they are:
[
    {
        "Name": "external",
        "Id": "iianul6ua9u1f1bb8ibsnwkyc",
        "Created": "2022-08-09T19:32:01.4491323Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.4.0/24",
                    "Gateway": "10.0.4.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "7aa77bc62b440e32c7b904fcbd91aea14e7a73133af0889ad9e0c9f75f2a884a": {
                "Name": "server_host-app.1.m2f5x8jvn76p2ssya692f4ydp",
                "EndpointID": "5d5175b73f1aadf2da30f0855dc0697628801a31d37aa50d78a20c21858ccdae",
                "MacAddress": "02:42:0a:00:04:06",
                "IPv4Address": "10.0.4.6/24",
                "IPv6Address": ""
            },
            "e23f5c2897833f800a961ab49a4f76870f0377b5467178a060ec938391da46c7": {
                "Name": "server_traefik.1.v5g3af00gqpulfcac84rwmnkx",
                "EndpointID": "4db5d69e1ad805954503eb31c4ece5a2461a866e10fcbf579357bf998bf3490b",
                "MacAddress": "02:42:0a:00:04:03",
                "IPv4Address": "10.0.4.3/24",
                "IPv6Address": ""
            },
            "lb-external": {
                "Name": "external-endpoint",
                "EndpointID": "ed668b033450646629ca050e4777ae95a5a65fa12a5eb617dbe0c4a20d84be28",
                "MacAddress": "02:42:0a:00:04:04",
                "IPv4Address": "10.0.4.4/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4100"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "3cb3e7ba42dc",
                "IP": "192.168.65.3"
            }
        ]
    }
]
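To make the inspect output easier to compare against the addresses that show up in the logs, the attached endpoints can be reduced to a name-to-IP table. This is only a rough sketch in Python; `attached_endpoints` is a hypothetical helper name, and it assumes the JSON comes from docker network inspect in the shape shown above.

```python
import json


def attached_endpoints(inspect_output: str) -> dict:
    """Map each endpoint name to its IPv4 address (prefix length removed).

    `inspect_output` is assumed to be the raw JSON text printed by
    `docker network inspect <network>`, i.e. a list of network objects.
    """
    result = {}
    for network in json.loads(inspect_output):
        # "Containers" may be missing or null when nothing is attached.
        for endpoint in (network.get("Containers") or {}).values():
            result[endpoint["Name"]] = endpoint["IPv4Address"].split("/")[0]
    return result
```

Saving the output above to a file and passing its contents to `attached_endpoints` should list the host-app task, the Traefik task, and the network's load-balancer endpoint with their addresses.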
Checking the Traefik access log, I get the following:
10.0.0.2 - - [09/Aug/2022:19:42:34 +0000] "GET /whoami HTTP/1.1" 502 11 "-" "-" 4 "host-app@docker" "http://10.0.4.9:9000" 0ms
As far as I can tell, that is the correct server:port for the whoami service. Even pinging 10.0.4.9 from inside the Traefik container works fine:
PING 10.0.4.9 (10.0.4.9): 56 data bytes
64 bytes from 10.0.4.9: seq=0 ttl=64 time=0.066 ms
64 bytes from 10.0.4.9: seq=1 ttl=64 time=0.057 ms
^C
--- 10.0.4.9 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.057/0.061/0.066 ms
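One caveat about the ping test: it only exercises ICMP, so it shows that the address is reachable but not that anything accepts TCP connections on the port Traefik is using. A TCP-level check would be more direct. The following is a minimal sketch in Python, assuming it is run from a container attached to the same overlay network that has Python available (the traefik image itself does not ship it); `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection performs the full TCP handshake; the context
        # manager closes the socket again immediately afterwards.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `can_connect("10.0.4.9", 9000)` with the address and port from the access log line would show whether the backend accepts connections on exactly the target Traefik is forwarding to.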
These logs and snippets are all from my local swarm on Docker for Windows with a WSL2 Ubuntu distribution. I also tested this on a CentOS swarm that can be requested within my company and on https://labs.play-with-docker.com/, and all of them lead to the same error.
Can anybody tell me what configuration I'm missing, or what mistake I made, that keeps this from working?