I'm trying to get ECS Service Discovery working with Prometheus.
Currently my ECS container gets added to Route 53 like so:
+-----------------------------------------------+------+--------------------------------------------------------+
| Name                                          | Type | Value                                                  |
+-----------------------------------------------+------+--------------------------------------------------------+
| my-service.local.                             | SRV  | 1 1 8080 123456-7890-1234-5678-12345.my-service.local. |
| 123456-7890-1234-5678-12345.my-service.local. | A    | 10.0.11.111                                            |
+-----------------------------------------------+------+--------------------------------------------------------+
I assume that if I added more running containers to ECS, I would get more A records in Route 53 with the name 123456-7890-1234-5678-12345.my-service.local.
In my Prometheus configuration file, I have supplied the following under scrape_configs:
    - job_name: 'cadvisor'
      scrape_interval: 5s
      dns_sd_configs:
        - names:
            - 'my-service.local'
          type: 'SRV'
However, when I check the target status in Prometheus, I see the following:
Endpoint: http://123456-7890-1234-5678-12345.my-service.local:8080/metrics
State: Down
Error: context deadline exceeded
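"context deadline exceeded" means the scrape did not complete before its timeout. To tell a name-resolution failure apart from an unreachable port, a minimal Python sketch like the following could be run from the host where Prometheus runs. The function name check_target and the 5-second timeout are illustrative assumptions, not anything Prometheus provides:

```python
import socket


def check_target(host: str, port: int, timeout: float = 5.0) -> str:
    """Return 'dns-failure', 'unreachable', or 'ok' for a host:port pair."""
    try:
        # Resolve the name through the system resolver, as an ordinary client would.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"
    for family, socktype, proto, _, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Covers both refused connections and timeouts; try the next address.
            continue
    return "unreachable"


# Example call (hypothetical usage, with the host and port from the target status):
# check_target("123456-7890-1234-5678-12345.my-service.local", 8080)
```

A result of "dns-failure" would point at name resolution, while "unreachable" would point at the network path or the port on 10.0.11.111.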
I'm not familiar with how DNS service discovery works with SRV records, so I'm not sure exactly where the problem lies. Looking at how AWS ECS Service Discovery added the records, it looks like my-service.local maps to 123456-7890-1234-5678-12345.my-service.local:8080. However, it looks like Prometheus doesn't then try to find the list of local IPs mapped to 123456-7890-1234-5678-12345.my-service.local and just tries to scrape from it directly.
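As a sanity check on how the SRV value breaks down, here is a minimal Python sketch. The names SrvRecord, parse_srv_value, and scrape_url are hypothetical, and the http scheme and /metrics path are assumed to be the Prometheus defaults, since the job above does not override them:

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv_value(value: str) -> SrvRecord:
    """Split an SRV record value of the form 'priority weight port target'."""
    priority, weight, port, target = value.split()
    # Drop the trailing dot that marks a fully qualified DNS name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def scrape_url(record: SrvRecord, scheme: str = "http", path: str = "/metrics") -> str:
    """Build the URL a scraper would request for this SRV target."""
    return f"{scheme}://{record.target}:{record.port}{path}"


record = parse_srv_value("1 1 8080 123456-7890-1234-5678-12345.my-service.local.")
print(scrape_url(record))
```

The printed URL matches the endpoint shown in the target status, which suggests the SRV lookup itself is working and that the target name and port are taken from the record as-is.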
Is there some configuration option that I'm missing to make this work, or have I misunderstood something at a fundamental level?