I have a Kubernetes cluster with Prometheus installed and a Spring Boot service that is scaled up and down, sometimes to zero, so I have configured the service to push its metrics to a Pushgateway. During an upgrade the Pushgateway happened to be unreachable for a couple of seconds. Meanwhile, PrometheusPushGatewayManager in Spring Boot Actuator tried to push the metrics, found that the Pushgateway host could not be resolved, and shut itself down. Is there a way to have PrometheusPushGatewayManager try multiple times before giving up?
-
You mean it fails because DNS fails? – Michael Doubez Feb 11 '20 at 19:20
-
Yes and no; it was just a hiccup in the DNS pod, not an outage. I think I have a workaround: if I change the polling interval/push rate to a relatively long value (1m), this should not happen, as the upgrades/hiccups do not last more than 30s. – Venkatesh Laguduva Feb 12 '20 at 12:06
-
Retrying on a DNS failure would be weird, but still safer than retrying on a failed POST (which could be non-idempotent). From [the code](https://github.com/spring-projects/spring-boot/blob/aef92b9295f62d008faa9ab79905a474bf3496f3/spring-boot-project/spring-boot-actuator/src/main/java/org/springframework/boot/actuate/metrics/export/prometheus/PrometheusPushGatewayManager.java#L110) there is not much you can do programmatically. I would say your best option is to fix your DNS and/or ask the Spring Boot team for a feature that allows a configurable number of failures before shutdown (easier to handle than a retry). – Michael Doubez Feb 12 '20 at 21:06
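For illustration, the "tolerate a few DNS failures before giving up" idea from the comment above could be sketched as a small helper in plain Java. This is not part of Spring Boot or the Prometheus client: `RetryingPush`, `PushAction`, and `pushWithRetry` are hypothetical names, and the sketch assumes you can wrap the push call yourself, for example in a custom subclass of the client's `PushGateway` that delegates its `pushAdd` to this helper and is handed to PrometheusPushGatewayManager. It retries only on `UnknownHostException`, so a failed POST is never repeated.

```java
import java.io.IOException;
import java.net.UnknownHostException;

// Hypothetical helper, not a Spring Boot or Prometheus client API.
final class RetryingPush {

    // Stand-in for the real push call, e.g. () -> gateway.pushAdd(registry, job, groupingKey).
    @FunctionalInterface
    interface PushAction {
        void push() throws IOException;
    }

    /**
     * Runs the action, retrying only when the host name cannot be resolved.
     * Any other IOException propagates immediately, so a request that may
     * already have reached the gateway is not sent twice.
     *
     * @return the number of attempts that were needed
     */
    static int pushWithRetry(PushAction action, int maxAttempts, long backoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                action.push();
                return attempt;
            } catch (UnknownHostException ex) {
                if (attempt == maxAttempts) {
                    // Out of attempts: surface the failure so the caller
                    // (here, the manager) can react as it normally would.
                    throw ex;
                }
                Thread.sleep(backoffMillis);
            }
        }
    }
}
```

With, say, `maxAttempts = 5` and `backoffMillis = 10_000`, a DNS hiccup of up to roughly 40 seconds would be absorbed before the exception reaches the manager. Note that the sleep blocks the scheduler thread that runs the push, so the total wait should stay well below the push rate.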