
I have an alarm that for months has worked properly to manage the size of my ASG. Since Monday (Oct. 12), though, it has stopped working; it stays in "OK" state even when the graphs clearly show that it is above the threshold. See the attached screen shot.

What may or may not be related is that the alarm will trigger, then fail with no error message. It looks like this happens when the alarm triggers during the cooldown stage of the ASG. Once this happens, the alarm reverts to "OK", then just stays there indefinitely, even though it is above the threshold. Before Monday, it would stay in alarm state, re-triggering repeatedly, until the ASG left cooldown state.

Does anybody know what is going on here? How can I fix this? And why did it suddenly change when there were no changes on my side?

[Screenshot of the problem, showing an "OK" alarm that is above threshold]

Bill Shubert

1 Answer


I can see some missing data around 15:15, and your missing data treatment is set to 'Treat missing data as good'. Could you change this to 'Treat missing data as ignore (maintain the alarm state)' and check whether the alarm behaves correctly?
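If you would rather script the change than click through the console, here is a minimal sketch using boto3. It assumes the console option 'Treat missing data as ignore' corresponds to `TreatMissingData="ignore"` in the `PutMetricAlarm` API, and that the alarm is a plain metric alarm. The alarm name `asg-scale-up` and both function names are hypothetical. Because `put_metric_alarm` replaces the whole alarm definition, the sketch reads the existing alarm first and copies over only the fields that call accepts; check the result against your own alarm before applying it.

```python
# Fields returned by DescribeAlarms that PutMetricAlarm also accepts as input.
PUT_ALARM_KEYS = {
    "AlarmName", "AlarmDescription", "ActionsEnabled", "OKActions",
    "AlarmActions", "InsufficientDataActions", "MetricName", "Namespace",
    "Statistic", "ExtendedStatistic", "Dimensions", "Period", "Unit",
    "EvaluationPeriods", "DatapointsToAlarm", "Threshold",
    "ComparisonOperator", "TreatMissingData",
    "EvaluateLowSampleCountPercentile", "Metrics", "ThresholdMetricId",
}

VALID_TREATMENTS = {"breaching", "notBreaching", "ignore", "missing"}


def build_put_alarm_args(described_alarm, treatment="ignore"):
    """Copy an alarm description, dropping read-only fields such as the ARN
    and current state, and set the requested missing-data treatment."""
    if treatment not in VALID_TREATMENTS:
        raise ValueError(f"unknown treatment: {treatment!r}")
    args = {k: v for k, v in described_alarm.items() if k in PUT_ALARM_KEYS}
    args["TreatMissingData"] = treatment
    return args


def set_missing_data_treatment(alarm_name, treatment="ignore"):
    """Apply the change to a live alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without it

    cloudwatch = boto3.client("cloudwatch")
    alarms = cloudwatch.describe_alarms(AlarmNames=[alarm_name])["MetricAlarms"]
    if not alarms:
        raise LookupError(f"no metric alarm named {alarm_name!r}")
    cloudwatch.put_metric_alarm(**build_put_alarm_args(alarms[0], treatment))


# Usage (hypothetical alarm name):
# set_missing_data_treatment("asg-scale-up", "ignore")
```

The pure helper is kept separate from the AWS call so you can inspect the arguments it builds before anything is sent.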

Cnf271
  • The missing data has been there all along; I'm pretty sure that it's just jitter in when new metrics get submitted. "Treat missing data as good" would indeed have switched it to "good" for one period, but then as soon as data came back it should have bounced back to alarm. But you're right, it's probably better to set it to ignore. I'll make that change, although I don't think it will fix the big problem here. – Bill Shubert Oct 15 '20 at 19:17
  • Well, this morning it was working again. Was it the flag change? Maybe, but I don't really think so, because for months it worked with the flag set to "treat missing as good." I suspect that a rollout of new alarm or ASG code broke this, then it was fixed. But anyway, since it is working and nobody else gave a better (or any!) answer, I'll mark your recommendation as a solution. Thanks! – Bill Shubert Oct 16 '20 at 14:05
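To make the reasoning in the comments concrete, here is a toy model of the two settings. It is a deliberate simplification, not CloudWatch's actual evaluation logic: it assumes one evaluation period, a "greater than threshold" comparison, and that each period either has a datapoint or is missing (`None`). The function name and the sample values are made up for illustration.

```python
def simulate_alarm(datapoints, threshold, treatment, initial="INSUFFICIENT_DATA"):
    """Return the alarm state after each period under a simplified model.

    datapoints: one value per period, or None where the period has no data.
    treatment:  "notBreaching" (treat missing as good), "breaching",
                "ignore" (keep the current state), or "missing".
    """
    state = initial
    history = []
    for value in datapoints:
        if value is None:
            if treatment == "notBreaching":
                state = "OK"
            elif treatment == "breaching":
                state = "ALARM"
            elif treatment == "missing":
                state = "INSUFFICIENT_DATA"
            elif treatment != "ignore":
                raise ValueError(f"unknown treatment: {treatment!r}")
            # "ignore" leaves the state unchanged
        else:
            state = "ALARM" if value > threshold else "OK"
        history.append(state)
    return history


# Metric stays above a threshold of 70, with one missing period.
points = [80, 85, None, 90]
print(simulate_alarm(points, 70, "notBreaching"))  # ['ALARM', 'ALARM', 'OK', 'ALARM']
print(simulate_alarm(points, 70, "ignore"))        # ['ALARM', 'ALARM', 'ALARM', 'ALARM']
```

Under this model, "treat missing as good" drops to OK for the one missing period and returns to ALARM once data resumes, which matches the behavior described in the first comment. It does not reproduce an alarm that stays in OK indefinitely while above the threshold.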