I'm still a newbie in Control-M, but we have implemented a solution with similar goals, using a job hook to probe nodes.
Assume your target server (node) is called JS and interacts with SERVER1 (let's call it node01). Any number of servers/nodes can be added later; let's start with just one node.
Overview of the components:
- Jobs created to monitor changes and check the status, both in the OK and the NOT OK state
- A quantitative resource created for each node, for example node01_run (or a shared one, as you wish)
- Jobs require the quantitative resource "node01_run" with at least 1 free unit
- If everything is OK, the jobs run as expected
- If downtime is detected, the quantitative resource (QR) is set to 0, so the affected jobs will not run
- If the node is up again, the quantitative resource is set back to its original value (10, 100, 1000, ...) and the jobs run again as usual
Job name: node01_check_resource
Job Goal ---> Check whether the quantitative resource already exists
Job OS Command ---> ecaqrtab LIST node01_run
Result yes ---> do nothing
Result no ---> run job node01_create_resource, command: ecaqrtab ADD node01_run 100 (or as many units as you wish)
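The resource-existence check can be sketched as a small agent-side script. `ecaqrtab` is the Control-M agent utility, but the exact output format of `LIST` varies by version, so the `grep` match here is an assumption you should adjust:

```shell
#!/bin/sh
# Sketch: create the QR only if it does not exist yet.
# Assumption: "ecaqrtab LIST <name>" prints a line containing the
# resource name when it exists; adjust the grep to your agent version.
ensure_qr() {
  qr_name="$1"
  qr_max="$2"
  if ecaqrtab LIST "$qr_name" 2>/dev/null | grep -q "$qr_name"; then
    echo "QR $qr_name already exists, nothing to do"
  else
    echo "QR $qr_name missing, creating it with $qr_max units"
    ecaqrtab ADD "$qr_name" "$qr_max"
  fi
}

# In the node01_check_resource job you would call:
#   ensure_qr node01_run 100
```

With this helper, node01_check_resource and node01_create_resource can even be collapsed into a single job if you prefer.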
Job name: node01_check (cyclic)
Job Goal ---> Check whether the node is up
Job OS Command ---> However you define that your node is up: check a web service, check uptime, a WMI result, ping, ...
Result up ---> rerun the job in x minutes
Result down ---> go on to the next job
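Whatever probe you pick, the job script only needs a clean up/down result for the On-Do rules. A minimal sketch, where the host name `node01` and the ping probe are just placeholders for your own check:

```shell
#!/bin/sh
# node_is_up: succeeds (exit 0) if the given probe command succeeds.
# The probe is whatever defines "up" for you: ping, a request against
# a web service, a WMI query wrapper, an uptime check, ...
node_is_up() {
  "$@" >/dev/null 2>&1
}

# Example usage in the node01_check job (ping is only an illustration):
if node_is_up ping -c 1 -W 2 node01; then
  echo "node01 UP"    # cyclic job simply reruns in x minutes
else
  echo "node01 DOWN"  # hand over to the next job
fi
```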
Job name: node01_up-down
Job Goal ---> Handle the switch from status up to status down
Job OS Command ---> ecaqrtab UPDATE node01_run 0
On-Do action ---> when the job has ended, remove the condition node01_check depends on, so that the cyclic node01_check no longer starts
Job name: node01_check_down (cyclic)
Job Goal ---> Check status while known status is down
Job OS Command ---> Same check as defined in node01_check
Result down ---> do nothing; the job reruns, as it is defined as cyclic
Result up ---> remove some conditions so the next job can start
Job name: node01_down-up
Job Goal ---> Switch everything back to normal operation
Job OS Command ---> ecaqrtab UPDATE node01_run 100
Action ---> Add the condition so that node01_check can run again
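The two switching jobs (node01_up-down and node01_down-up) differ only in the value written to the QR, so they can share one helper. This is a sketch; the 100 is just the example maximum from above:

```shell
#!/bin/sh
# Sketch: flip the QR between "down" (0 units) and "up" (original max).
set_node_state() {
  qr_name="$1"
  state="$2"
  qr_max="${3:-100}"   # default maximum, adjust to your setup
  case "$state" in
    down) ecaqrtab UPDATE "$qr_name" 0 ;;
    up)   ecaqrtab UPDATE "$qr_name" "$qr_max" ;;
    *)    echo "usage: set_node_state <qr> up|down [max]" >&2; return 2 ;;
  esac
}

# node01_up-down would run:  set_node_state node01_run down
# node01_down-up would run:  set_node_state node01_run up 100
```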
You can define such job hooks for multiple nodes, and in each job you can define which nodes need to be up and running (meaning their quantitative resource is higher than 0). You can also monitor multiple hosts and still set the same resource, as you wish.
I hope this helps further, unless you have already found a suitable solution.