I have an HPC cluster and I would like to monitor its health with Icinga2. I have a number of checks defined for each node in the cluster, but what I would really like is to get a notification if more than a certain percentage of the nodes are sick.
I notice that is possible to define a dummy host which represents the cluster and use the Icinga domain specific language to achieve something like I'm interested (http://docs.icinga.org/icinga2/latest/doc/module/icinga2/chapter/advanced-topics?highlight-search=up_count#access-object-attributes-at-runtime). However this seems like an inelegant and awkward solution.
Is it possible to define this kind of "aggregate" or "meta check" over a hostgroup?