
I'm trying to raise an alert if a server drops below a free-memory threshold x number of times in a day.

{my_template:vm.memory.size[free].count(1m,5G,lt,1d)}>5
{my_template:vm.memory.size[free].count(1m,5368709120,lt,1d)}>5

I've also tried this while free memory was 9G, but it failed too.

{my_template:vm.memory.size[free].count(1m,5G,gt,1d)}>5
Kenny Rasschaert
Guest_M

1 Answer

The Zabbix documentation for the count function specifies the options as follows:

count (sec|#num,<pattern>,<operator>,<time_shift>)

As for time_shift, the documentation explains what it does in more detail:

Several functions support an additional, second time_shift parameter. This parameter allows to reference data from a period of time in the past. For example, avg(1h,1d) will return the average value for an hour one day ago.

Your examples use 1m in the first argument, which means that they only look at a time period of one minute, and by time shifting it 1d, you're looking at a time period of 1 minute, exactly 24 hours ago. That doesn't seem like what you want to watch.
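To make the difference concrete, here is a minimal Python sketch of which time window each variant inspects. It only models the behaviour described above; `evaluation_window` is a hypothetical helper, not part of Zabbix, and the "now" timestamp is made up for illustration.

```python
from datetime import datetime, timedelta

def evaluation_window(now, period, time_shift=timedelta(0)):
    """Return (start, end) of the interval that count(period, ..., time_shift) looks at.

    Hypothetical model: the window ends time_shift before 'now'
    and is 'period' long.
    """
    end = now - time_shift
    start = end - period
    return start, end

# Arbitrary evaluation time, chosen only for the example
now = datetime(2019, 7, 31, 14, 0, 0)

# count(1m,...,1d): one minute of data, ending exactly 24 hours ago
shifted = evaluation_window(now, timedelta(minutes=1), timedelta(days=1))

# count(1d,...): the whole of the last 24 hours
full_day = evaluation_window(now, timedelta(days=1))

print("count(1m,...,1d):", shifted)
print("count(1d,...):   ", full_day)
```

The first window covers a single minute yesterday, while the second covers the full day leading up to the evaluation.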

You seem to be using the second and third parameters correctly, as well as the operator outside the function.

To get the trigger as you described it, I'd forgo the time_shift and set the first parameter to 1d.

This is probably closer to what you describe:

{my_template:vm.memory.size[free].count(1d,5368709120,lt)}>5

It's important to note, however, that the count function is heavily reliant on how many data points have been gathered in the specified time period, which depends on the item monitoring interval.

In the example below, Zabbix is listing the data gathered for memory in the past 24 hours. Since the interval is set to 30 seconds, that gives 2880 data points (86400 seconds / 30 seconds).

[Screenshot: Zabbix listing of the memory data gathered over the past 24 hours]

When you say you want the trigger to fire after the count function returns >5, what that means is that it will fire when more than 5 of the 2880 data points meet the criteria.

This can be >5 points spread throughout the day, or >5 consecutive points, meaning that it happened once, for 2.5 minutes.
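A small Python sketch of that counting behaviour, under the assumptions of a 30-second interval and made-up sample values (9 GiB "healthy", 4 GiB "low"). `count_lt` is a hypothetical stand-in for count(1d,5368709120,lt), not Zabbix code.

```python
THRESHOLD = 5368709120                          # 5G in bytes, as in the trigger
INTERVAL_S = 30                                 # item interval from the example
POINTS_PER_DAY = 24 * 60 * 60 // INTERVAL_S     # 2880 data points

def count_lt(values, threshold):
    """Count the data points strictly below the threshold."""
    return sum(1 for v in values if v < threshold)

HEALTHY = 9 * 1024**3   # illustrative value: 9 GiB free
LOW = 4 * 1024**3       # illustrative value: 4 GiB free

# Case 1: a single dip lasting 6 consecutive data points
one_dip = [HEALTHY] * POINTS_PER_DAY
for i in range(100, 106):
    one_dip[i] = LOW

# Case 2: 6 isolated low readings spread across the day
scattered = [HEALTHY] * POINTS_PER_DAY
for i in range(0, POINTS_PER_DAY, 480):
    scattered[i] = LOW

for name, data in (("one dip", one_dip), ("scattered", scattered)):
    n = count_lt(data, THRESHOLD)
    print(name, n, "fires" if n > 5 else "quiet")
```

Both cases produce the same count, so the trigger cannot tell one short dip apart from six separate ones.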

What would probably be a better idea would be to create a new Calculated item. Let's call it "5 minute memory dip". I'll give it the key "foo.bar.free.memory.low". It could use this formula:

max(vm.memory.size[free], 5m)<5368709120

It will store a 1 when the highest value for free memory in the last 5 minutes was below 5G, and a 0 otherwise.

Then, create a trigger based on that new item:

{my_template:foo.bar.free.memory.low.count(1d,0,gt)}>5

This trigger will fire when there have been >5 such dips in the past day.

This method should really cut down on the false positives and more reliably count the real memory dips.
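As a rough illustration of why this helps, the following Python sketch simulates the two-step approach. It assumes a 30-second interval on the memory item and that the calculated item is evaluated once every 5 minutes (that update interval is an assumption, it is not stated above). `dip_flags` and `trigger_fires` are hypothetical helpers, not Zabbix functions.

```python
THRESHOLD = 5 * 1024**3       # 5G in bytes (5368709120)
SAMPLES_PER_WINDOW = 10       # 5 minutes of data at an assumed 30s interval

def dip_flags(samples, window=SAMPLES_PER_WINDOW, threshold=THRESHOLD):
    """Model of the calculated item: one 0/1 value per 5-minute window.

    A window yields 1 only if even its highest reading is below the threshold.
    """
    flags = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        flags.append(1 if max(chunk) < threshold else 0)
    return flags

def trigger_fires(flags, limit=5):
    """Model of count(1d,0,gt) > 5 applied to the 0/1 item."""
    return sum(1 for f in flags if f > 0) > limit

HEALTHY = 9 * 1024**3   # illustrative value: 9 GiB free
LOW = 4 * 1024**3       # illustrative value: 4 GiB free

# A short blip: 6 low readings, shorter than a full 5-minute window
blip = [HEALTHY] * 2880
for i in range(100, 106):
    blip[i] = LOW

# Six sustained dips, each filling a whole 5-minute window
sustained = [HEALTHY] * 2880
for w in (10, 50, 90, 130, 170, 210):
    for i in range(w * 10, (w + 1) * 10):
        sustained[i] = LOW

print("blip fires:", trigger_fires(dip_flags(blip)))
print("sustained fires:", trigger_fires(dip_flags(sustained)))
```

In this model the short blip never produces a 1, because each window it touches still contains a healthy reading, while the six sustained dips push the count past 5.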

Kenny Rasschaert
  • It worked...thank you so much :) So this will alert right away if the total number of times it dropped below 5 GB is more than 5 – Guest_M Jul 31 '19 at 14:06
  • As a test I added {my_template:vm.memory.size[free].count(1d,5G,gt)}>50. Current free memory is 9G and the item is set to a 5s interval. The trigger went up within the minute. By my understanding it will count to 12 in 1 min and be over 50 at around 4min 5sec, so it should fire after about 4 min, right? – Guest_M Jul 31 '19 at 14:28
  • Yes, I think you are right, you will need to factor your interval into things. If you have a very fast checking interval, you will reach your threshold for "count" sooner. – Kenny Rasschaert Jul 31 '19 at 14:43
  • Even when the interval is changed to 30s or 1m, it triggered after a minute – Guest_M Jul 31 '19 at 14:48
  • Any way to make it work? – Guest_M Aug 01 '19 at 13:48
  • I've modified my answer and added a new suggestion. – Kenny Rasschaert Aug 02 '19 at 12:56