We have created a Docker swarm with several containers and, in general, it is working fine. We are still developing the first version of our product, so nothing is deployed to a production environment yet. Our configuration so far (docker-compose.yml) is really simple: one network, one volume, and 6 services with 1 replica each (we intend to add more replicas later). All services are REST services, and there is only 1 node (we intend to add more nodes later).
The only strange thing we've noticed, and it worries us, is that after the swarm has been idle for a while (e.g. no incoming requests to any container on a dev server), the first request to each service has high latency. Subsequent requests to that same service respond immediately. For example:
Swarm: Service A, Service B
Idle 30 mins --> no incoming requests to any of the services for 30 mins
Incoming request (non-cacheable request) to Service A --> responds after (approx.) 20 secs
Incoming request (non-cacheable request) to Service A --> responds immediately
... Further hits to Service A respond immediately
Incoming request (non-cacheable request) to Service B --> responds after (approx.) 20 secs
Incoming request (non-cacheable request) to Service B --> responds immediately
... Further hits to Service B respond immediately
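The pattern above could be measured with a small script. This is only a minimal Python sketch, assuming each service exposes some HTTP endpoint. The helper names (time_call, measure, http_get) and the URL in the trailing comment are hypothetical and not part of our setup. It times several consecutive requests so that the first one after an idle period can be compared with the following ones.

```python
import time
import urllib.request


def time_call(fn):
    """Return the wall-clock seconds taken by one call to fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def measure(fn, attempts=3):
    """Time `attempts` consecutive calls to fn and return the durations."""
    return [time_call(fn) for _ in range(attempts)]


def http_get(url, timeout=60):
    """Build a zero-argument callable that fetches url and reads the body."""
    def fetch():
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    return fetch


# Example usage (hypothetical endpoint, replace with a real non-cacheable route):
# for i, seconds in enumerate(measure(http_get("http://service-a:8080/status")), 1):
#     print(f"request {i}: {seconds:.2f}s")
```

Run after the swarm has been idle, a first duration far larger than the rest would match the behaviour described above.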
We opened a shell inside the containers (bash) and noticed that:
- Pinging other containers by IP responds immediately
- Pinging other containers by name (DNS) shows the delay described above (high latency)
- The container name (DNS) resolution itself happens immediately (so the latency is not at the DNS level)
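To separate the two steps in the observations above more precisely than ping allows, name resolution and connection setup can be timed independently. The following is a minimal Python sketch under the assumption that the target service listens on a TCP port. The function names (timed, probe) and the host and port in the trailing comment are hypothetical.

```python
import socket
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def probe(host, port, timeout=60):
    """Return (ip, dns_seconds, connect_seconds) for host:port.

    Step 1 resolves the name; step 2 opens a TCP connection to the
    resolved address, so the two delays are reported separately.
    """
    infos, dns_seconds = timed(
        socket.getaddrinfo, host, port, socket.AF_INET, socket.SOCK_STREAM
    )
    ip = infos[0][4][0]  # first resolved IPv4 address

    def connect():
        with socket.create_connection((ip, port), timeout=timeout):
            pass

    _, connect_seconds = timed(connect)
    return ip, dns_seconds, connect_seconds


# Example usage (hypothetical service name and port):
# print(probe("service-b", 8080))
```

If the resolution time stays small while the connection time carries the delay, that would agree with the third observation. It is also worth comparing the address the name resolves to with the container IP that was pinged directly, since the two may differ.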
More details:
- Host OS: Ubuntu 16.04
- Docker version: 17.06.0-ce
Does anybody have any idea what could be going on here?