
We have an array of servers, any of which could go down, generating a medium-priority notification:

define host {
        host_name       foo1
        contacts        medium-priority
        use     default-host
}
...

However, we'd like a higher-priority notification whenever more than two such servers are in trouble. To that end, we've set up a separate service definition using the check_cluster utility from Nagios/Icinga:

define service {
        service_description     foo-cluster
        servicegroups   cluster-checks
        display_name    Foo Cluster
        check_command   check_cluster_host!Foo Cluster!0!3!$HOSTSTATEID:foo1$,$HOSTSTATEID:foo2$,...,$HOSTSTATEID:fooN$
        contacts        high-priority
        hostgroup_name  clusters
        notes   Check that no more than 2 hosts in group foo are in trouble
        use     default-service
}

The above will probably work, but I'd like for this service-check to be triggered not by time, but only by a change in the status of any of the "underlying" hosts...

We generate Icinga's config-files with Ansible and so can construct complex dependencies programmatically -- but can such triggering be implemented at all?

Mikhail T.

1 Answer


You could define an event handler on each host: a small script that Icinga runs on a host state change and that acts on the parameters it receives. You can pass the host's state attributes to it as command arguments via runtime macros.

https://www.icinga.com/docs/icinga1/latest/en/eventhandlers.html

I would define a custom variable on the host that lists the services to trigger when the event handler fires. That way you don't need to hardcode them inside the script.

Your script can then force a new service check via the external command pipe. You should decide whether to act on SOFT states as well or only on HARD ones. Keep in mind, though, that event handlers fire only on a state change, not on DOWN->DOWN->DOWN, for example.

Example: https://github.com/Icinga/icinga-core/blob/master/contrib/eventhandlers/submit_check_result.in

Note: the service should have active checks disabled, and it should keep the actual service check command rather than a dummy command, because the forced check runs that command.
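
To make the idea concrete, here is a rough sketch of such an event handler in Python. Treat it as an illustration under assumptions rather than a tested setup: the script name, the `_TRIGGER_SERVICES` custom variable, its `host:service` format, the `trigger-cluster-check` command name, and the command-file path are all made up for this example and need adapting to your installation. The only Icinga-specific part it relies on is the `SCHEDULE_FORCED_SVC_CHECK` external command.

```python
#!/usr/bin/env python3
"""Hypothetical host event handler that forces a re-check of cluster services.

Assumed wiring in the object config (all names here are illustrative):

    define host {
            host_name       foo1
            _TRIGGER_SERVICES       cluster-host:foo-cluster
            event_handler   trigger-cluster-check
            ...
    }
    define command {
            command_name    trigger-cluster-check
            command_line    /usr/local/bin/trigger_cluster_check.py "$HOSTSTATETYPE$" "$_HOSTTRIGGER_SERVICES$"
    }
"""
import sys
import time

# Assumption: location of the external command pipe; check "command_file"
# in your main configuration file for the real path.
COMMAND_FILE = "/var/lib/icinga/rw/icinga.cmd"


def build_commands(state_type, targets, now):
    """Return one SCHEDULE_FORCED_SVC_CHECK line per 'host:service' target.

    Only HARD state changes produce commands; SOFT ones are ignored here.
    """
    if state_type != "HARD":
        return []
    lines = []
    for target in targets.split(","):
        target = target.strip()
        if not target:
            continue
        # Split on the first colon only, so the service part may contain colons.
        host, service = target.split(":", 1)
        lines.append(
            f"[{now}] SCHEDULE_FORCED_SVC_CHECK;{host};{service};{now}\n"
        )
    return lines


def main(state_type, targets):
    """Write the generated commands to the external command pipe."""
    lines = build_commands(state_type, targets, int(time.time()))
    if lines:
        with open(COMMAND_FILE, "w") as pipe:
            pipe.writelines(lines)


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

The custom variable uses `:` as the separator because `;` starts a comment in the object config files. Since you generate the config with Ansible, the value of `_TRIGGER_SERVICES` could be templated per host group, so the script itself never needs to know which clusters exist.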

(Check result submission of this kind was also used for somewhat hackish distributed monitoring in the old Nagios/Icinga 1 world, if you're looking for more examples involving the command pipe and event handlers.)

dnsmichi