I have a Docker swarm running our business stack, defined in a docker-compose.yml, on two servers (nodes). The compose file defines cAdvisor so that it starts on each of the two nodes, like this:
cadvisor:
  image: gcr.io/google-containers/cadvisor:latest
  command: "--logtostderr --housekeeping_interval=30s"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /:/rootfs:ro
    - /var/run:/var/run
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    - /dev/disk:/dev/disk/:ro
  ports:
    - "9338:8080"
  deploy:
    mode: global
    resources:
      limits:
        memory: 128M
      reservations:
        memory: 64M
On a third server I run Docker separately, outside the swarm formed by nodes 1 and 2; this server hosts Prometheus and Grafana. Prometheus is configured to scrape only node1:9338 to collect the cAdvisor metrics.
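The relevant part of my Prometheus configuration boils down to something like this (simplified; the job name is just illustrative):

scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['node1:9338']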
I occasionally run into the problem that, when scraping node1:9338, not all containers running on nodes 1 and 2 show up in the cAdvisor statistics.
I was assuming that cAdvisor syncs its information across the swarm, so that I could configure Prometheus to use only node1:9338 as the entry point into the swarm and scrape everything from there.
Or do I also have to put node2:9338 into my Prometheus configuration to reliably get the information of all nodes? If so, how is this supposed to scale, since I would then need to add every new node to the Prometheus config by hand, as sketched below?
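By that I mean listing every node explicitly as a static target, roughly like this (again simplified):

scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['node1:9338', 'node2:9338']   # every new node would have to be appended here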
Running Prometheus together with the business stack in one swarm is not an option.
Edit: Today I noticed strange behaviour when opening the cAdvisor metrics URLs http://node1:9338/metrics and http://node2:9338/metrics: both URLs show the same information, namely the containers running on node1. The information about the containers running on node2 is missing when requesting http://node2:9338/metrics.
Could it be that Docker's internal load balancing (the swarm ingress routing mesh) is routing the request to http://node2:9338/metrics to the cAdvisor instance on node1, so that node1's metrics are returned even though node2 was requested?
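If that is the cause, I suspect I would have to publish the cAdvisor port in host mode so that each node answers with its own cAdvisor instead of going through the routing mesh. A sketch of what I have in mind, using the compose long port syntax (untested, requires compose file format 3.2+):

ports:
  - target: 8080      # cAdvisor's internal port
    published: 9338   # port exposed on each node
    protocol: tcp
    mode: host        # bypass the ingress routing mesh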