
I've been configuring HTTP health checks for all my apps in Marathon, and they are working nicely. The trouble is that Marathon keeps stepping in and restarting any container that fails its health check, and I won't know about it unless I happen to be looking at the Marathon UI.

Is there a way to retrieve all apps that have a failed healthcheck so I can send an email alert or similar?

Omiron

1 Answer


Marathon exposes information about failing health checks on its event bus, so you can write a simple service that consumes Marathon's health check event ("eventType": "instance_health_changed_event") and translates it into a metric, an alert, you name it.
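
To illustrate, here is a minimal sketch of such a consumer in Python. It rests on several assumptions that are not stated above: that Marathon serves its event stream as Server-Sent Events at /v2/events, that it listens on localhost:8080, and that each health event carries "healthy" and "runSpecId" fields. The names unhealthy_events, send_alert and main are hypothetical; check the event payload of your Marathon version before relying on any of this.

```python
import json
import urllib.request

# Assumption: address of your Marathon instance and its SSE event endpoint.
MARATHON_EVENTS_URL = "http://localhost:8080/v2/events"


def unhealthy_events(lines):
    """Yield decoded events for instances that have just become unhealthy.

    `lines` is any iterable of text lines in Server-Sent Events format.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments and blank separators
        try:
            event = json.loads(line[len("data:"):])
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
        if not isinstance(event, dict):
            continue
        if event.get("eventType") != "instance_health_changed_event":
            continue
        # Assumed field: "healthy" is false when the health check fails.
        if event.get("healthy") is False:
            yield event


def send_alert(event):
    """Placeholder alert: replace with email, chat or metrics of your choice."""
    print("Unhealthy instance in app:", event.get("runSpecId"))


def main():
    """Connect to the event stream and alert on every unhealthy instance."""
    request = urllib.request.Request(
        MARATHON_EVENTS_URL, headers={"Accept": "text/event-stream"}
    )
    with urllib.request.urlopen(request) as response:
        lines = (raw.decode("utf-8") for raw in response)
        for event in unhealthy_events(lines):
            send_alert(event)
```

Calling main() blocks and keeps reading the stream. The parsing is kept in a separate function so it can be tried against captured events without a running cluster.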

For reference, I can recommend allegro/appcop, a service that scales down unhealthy applications. Its code could easily be adapted to do what you want.

marc_s
janisz
    Thanks. In the meantime I had written code that called the Marathon APIs on a schedule; I wasn't aware of this event bus – Omiron Apr 27 '19 at 21:48