
I have this trigger that fires upon a match of the rule below:

{monitoring:test.item.change(0)}<-100

When my graph goes down by more than 100 units, an event gets created. The event should switch to OK status when the graph goes back up. The graph has different average values at different times of day, and besides, the item is a trapper value, which does not support flexible intervals.

My problem is this: when the graph falls by more than 100 units, say from 300 to 10, a PROBLEM event is created. At the next interval, if the value is still low (e.g. 13), Zabbix creates an OK event, because although the value is still low, the expression no longer evaluates to true: the graph hasn't gone down by a further 100 units. Any ideas on how I could fix this? I have been trying to use

{{monitoring:test.item.avg(1800)}-{monitoring:test.item.last(0)}>100}

but Zabbix wouldn't take that expression. This is supposed to compare the last value of test.item to the average value of the past 30 minutes and raise an alert when the difference exceeds 100.

This, I believe, would sort out my problem situation of a false OK status when the graph remains at a low value.

EDIT: I think I have cracked it. Zabbix has accepted the below expression:

{monitoring:test.item.avg(1800)}-{monitoring:test.item.last(0)}>100
Richlv
170730350

1 Answer


I think you'll soon realize that this expression won't give you the behavior you're after; it will keep flapping between PROBLEM and OK.

You have just moved the 'did a drop of more than 100 occur' check from 'the last value versus the previous one'
to 'the last value versus the average of the last half hour'.
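
To make that concrete, here is a rough Python sketch (not Zabbix code) that replays a made-up series through the avg(1800) - last(0) > 100 check. It assumes one value every 5 minutes, so the 30-minute window holds 6 samples; the sample values and the helper name are invented for illustration.

```python
from collections import deque

def avg_minus_last_states(values, window=6, threshold=100):
    """Replay values through 'avg(window) - last > threshold'.

    Assumes the window average includes the newest value, and that
    6 samples stand in for 1800 seconds (one value every 5 minutes).
    """
    recent = deque(maxlen=window)
    states = []
    for value in values:
        recent.append(value)
        average = sum(recent) / len(recent)
        states.append("PROBLEM" if average - value > threshold else "OK")
    return states

# Hypothetical series: steady at 300, then a drop that stays low.
values = [300] * 6 + [10, 13, 12, 11, 10, 12]
states = avg_minus_last_states(values)
for value, state in zip(values, states):
    print(value, state)
```

In this run the trigger goes to PROBLEM on the drop, but returns to OK a few samples later, once the low values have dragged the average down, even though the value never recovered.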

Checking whether there was an abrupt change OR
whether the value is still too low will probably match your expected scenario better:

{monitoring:test.item.change(0)}<-100 | {monitoring:test.item.max(#2)}<20

max(#2)<20 checks whether the maximum of the last 2 values is below 20, i.e. whether both of them are still low.
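
As a quick sanity check of the "abrupt change OR still too low" idea, the following Python sketch (again, not Zabbix code) compares a drop-only check with the combined one on the 300 to 10 to 13 example from the question. The -100 drop threshold is borrowed from the question's original trigger, the floor of 20 from max(#2)<20; the later sample values and the function names are assumptions for illustration.

```python
def drop_only_states(values, drop=100):
    """PROBLEM only when the newest value fell by more than `drop`."""
    states = ["OK"]  # first value has no predecessor to compare with
    for previous, current in zip(values, values[1:]):
        states.append("PROBLEM" if current - previous < -drop else "OK")
    return states

def drop_or_low_states(values, drop=100, floor=20):
    """PROBLEM on an abrupt drop OR when the last two values are both below `floor`."""
    states = ["OK"]
    for previous, current in zip(values, values[1:]):
        abrupt_drop = current - previous < -drop
        still_low = max(previous, current) < floor
        states.append("PROBLEM" if abrupt_drop or still_low else "OK")
    return states

# Hypothetical series: drop from 300, stay low, then recover.
values = [300, 10, 13, 12, 250]
print(drop_only_states(values))
print(drop_or_low_states(values))
```

With the drop-only check the state falls back to OK as soon as the value stops falling; with the combined check it stays in PROBLEM until the value actually climbs back. Note that a fixed floor like 20 would need adjusting for periods where low values are normal.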

EDIT: After reading your comment, maybe this approach (after some tweaking for your expected values) will serve you better:

({monitoring:test.item.avg(1800)}<10 & {monitoring:test.item.avg(1800)}-{monitoring:test.item.last(0)}>20) | ({monitoring:test.item.avg(1800)}>100 & {monitoring:test.item.avg(1800)}-{monitoring:test.item.last(0)}>100)

This way, the trigger will better fit the different volumes during the day.
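
For experimenting with those thresholds outside Zabbix, the two-band expression can be written as a small Python predicate. This is only a sketch of the logic above; the function and parameter names are made up, and the defaults are the numbers from the expression, which are meant to be tweaked.

```python
def banded_trigger(avg_30m, last,
                   low_avg=10, low_drop=20,
                   high_avg=100, high_drop=100):
    """Return True when the two-band expression would be in PROBLEM.

    avg_30m stands in for avg(1800), last for last(0).
    """
    drop = avg_30m - last
    low_band = avg_30m < low_avg and drop > low_drop      # off-peak hours
    high_band = avg_30m > high_avg and drop > high_drop   # busy hours
    return low_band or high_band

# Hypothetical readings: (30-minute average, latest value)
print(banded_trigger(300, 10))   # busy-hour collapse
print(banded_trigger(376, 300))  # busy-hour dip of less than 100
print(banded_trigger(5, 4))      # normal off-peak value
print(banded_trigger(50, 10))    # average between the two bands
```

Two things worth checking when tuning: averages between the two bands (10 to 100 with the defaults) never trigger, and with non-negative values the low band cannot fire as long as low_drop is larger than low_avg, so those two numbers need adjusting to your real off-peak levels.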

Joao Figueiredo
  • I am actually working with the average for the last half-hour, not for the value posted half an hour ago. Also note that the values have greatly varying averages during different times of the day. Between 2 and 4 AM the average is about 5, while between 6 PM and 8 PM, the average goes as high as 376 units. I am trying not to have false alerts generated during off-peak hours because of the low values, that's why I am working with averages. As the day wears on, the graph goes up smoothly rather than sharply as a spike. The reverse is also true for when the graph is going up. – 170730350 Aug 21 '12 at 09:22
  • You're right. I misread your expression. Now your goal is clearer. – Joao Figueiredo Aug 21 '12 at 10:07