
I've set up a Docker monitoring stack using Prometheus, Grafana and cAdvisor. I'm using this query to count running containers:

count_scalar(container_last_seen{name=~"container1|container2"})

It picks up the containers all right: as soon as I launch a new container, it is picked up right away. The problem is that when a container is stopped or removed, the query does not notice; it still shows the container as running.

On the cAdvisor /metrics endpoint, the container is removed as soon as it stops.

Is there something wrong with the query?

(This is what I used for the stack: https://github.com/vegasbrianc/prometheus)

A.Jac

1 Answer


It seems to be related to the amount of time cAdvisor stores the data in memory.

While cAdvisor keeps the data in memory, the container_last_seen metric still carries a valid timestamp, so count_scalar still 'sees' the container.

In my test setup, cAdvisor keeps the data for 5 minutes. After that, your query returns the right result because the container_last_seen metric has disappeared.

You can change this cAdvisor configuration with the --storage_duration flag.

--storage_duration=2m0s: How long to store data.

As an alternative, if you want quick alerting, you could also run a query that compares the last-seen timestamp with the current time:

count_scalar(time()-container_last_seen{name=~"container1|container2"}<=60)
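To make the logic of that comparison explicit, here is a minimal Python sketch of the same filter applied to in-memory values. The function name, the sample timestamps and the dictionary layout are made up for illustration; this is not how Prometheus evaluates the expression internally, only the arithmetic behind it.

```python
import time


def count_recently_seen(last_seen, now=None, max_age_seconds=60):
    """Count containers whose last-seen timestamp is at most max_age_seconds old.

    last_seen maps a container name to a Unix timestamp, mirroring the
    value of the container_last_seen metric (hypothetical layout).
    """
    if now is None:
        now = time.time()
    # Same test as: time() - container_last_seen <= 60
    return sum(1 for ts in last_seen.values() if now - ts <= max_age_seconds)


# Hypothetical sample data: a fixed "current time" keeps the example deterministic.
now = 1_000_000.0
samples = {
    "container1": now - 5,    # still running, seen 5 s ago
    "container2": now - 180,  # stopped 3 min ago, series still exported
}

running = count_recently_seen(samples, now=now)
print(running)
```

With these sample values, container2 is filtered out even though its series is still being exported, which is the effect the query above is after.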
François Maturel