I have the following docker-compose.yml:
version: '3.7'
services:
  gateway:
    image: rmilejcz/kalos-gateway:latest
    deploy:
      replicas: 1
    ports:
      - '443:443'
    networks:
      - rpcnet
  rpc:
    image: rmilejcz/kalos-rpc:latest
    deploy:
      replicas: 1
    ports:
      - '8418:8418'
    networks:
      - rpcnet
  proxy:
    image: rmilejcz/grpcwebproxy:latest
    deploy:
      replicas: 1
    ports:
      - '8080:8080'
    networks:
      - rpcnet
networks:
  rpcnet:
It is essentially an rpc server with two separate reverse proxies: gateway translates normal HTTP requests and forwards them to rpc, and proxy translates gRPC-web requests and forwards them to rpc.
When I run this via docker-compose up it works as expected, and this is easily confirmed by running:
curl localhost:443/v1/lookup/vendor
However when I try to run this in a swarm:
docker swarm init
docker deploy --compose-file docker-compose.yml test
# OR
docker stack deploy --compose-file docker-compose.yml test
The previously working curl example returns:
all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp: lookup rpc on 127.0.0.11:53: no such host\"
meaning that the rpc service is not available. I'm not sure where 127.0.0.11:53 comes from. I'm guessing 127.0.0.11 is what rpc resolves to, but I'm not sure what :53 is derived from.
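One way to read that error string, as a sketch rather than a diagnosis: Go's resolver reports a failed lookup as "lookup <name> on <server>: <reason>". Under that reading, rpc is the name being looked up and 127.0.0.11:53 is the DNS server that was asked. 127.0.0.11 is the address of Docker's embedded DNS server on user-defined networks and 53 is the standard DNS port, so it would not be an address that rpc resolved to. The variable names below are illustrative only.

```shell
#!/bin/sh
# Split a Go resolver error of the assumed form:
#   lookup <name> on <server>:<port>: <reason>
err='dial tcp: lookup rpc on 127.0.0.11:53: no such host'

# Name being resolved: the text between "lookup " and " on ".
name=${err#*lookup }
name=${name%% on *}

# DNS server that was queried: the text between " on " and the next ": ".
server=${err#* on }
server=${server%%: *}

# Split the server address into host and port.
dns_host=${server%:*}
dns_port=${server##*:}

echo "name=$name dns_host=$dns_host dns_port=$dns_port"
```

If this reading is right, the message says the name rpc could not be resolved on the network at that moment, which is consistent with the rpc service having no running task.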
docker service ls test_rpc shows REPLICAS at 0/1. I'm almost certain that, for whatever reason, the rpc service fails to bind to rpc:8418, because if I change that to localhost:8418 and run docker service ls test_rpc I can see that REPLICAS is at 1/1. However, I am still unable to communicate with that service via either proxy due to the same error as above (all SubConns are in TransientFailure).
Am I making a bad assumption about container communication within a Docker swarm? Is there any way for me to get detailed error information from the rpc service to determine exactly why it is failing? If I run docker-compose up I can see the services' stdout in my terminal; is there some similar capability for Docker swarm?
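For reference, a minimal sketch of the swarm-side commands that appear closest to what docker-compose up prints: docker service ps lists a service's tasks along with any task error, and docker service logs prints the tasks' stdout/stderr. The helper name inspect_service and the DOCKER override are hypothetical conveniences, and test_rpc assumes the stack was deployed under the name test as above.

```shell
#!/bin/sh
# DOCKER can be overridden (for example DOCKER=echo) to print the
# commands instead of running them; this override is a hypothetical
# convenience, not a Docker feature.
DOCKER=${DOCKER:-docker}

# inspect_service <service>: show task state and recent logs for one
# swarm service (hypothetical helper name).
inspect_service() {
    # Task history, with the full (untruncated) ERROR column.
    $DOCKER service ps --no-trunc "$1"
    # Aggregated stdout/stderr from the service's tasks, last 50 lines.
    $DOCKER service logs --tail 50 "$1"
}
```

Calling inspect_service test_rpc would then show why the task is not reaching 1/1, if the task itself reports an error, followed by whatever the rpc process wrote before exiting.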