
I am trying to create an alert in Splunk: if the expression "Error occured due to connection" is present in the logs and the problem is not remediated automatically within 5 minutes, it should generate an alert.

Here, remediation means that "Error occured due to connection" does not occur again in the next five minutes, in which case the issue is fixed. Is this possible? Please advise.

knowledge20

2 Answers

1

If the "Error occured due to connection" message appears every 5 minutes until the problem is corrected, then you should be able to detect remediation by counting the number of messages in the last 6 minutes.

index=foo "Error occured due to connection" earliest=-6m
| stats count
| where count > 1
RichG
  • We can get multiple messages within 5 minutes. When we get a connection error, it might resolve automatically, so we give it 5 minutes to heal itself and then check in the 6th minute whether the count is > 0. Is there a way to look at only the 6th minute here instead of the last 6 minutes? How can we test this scenario when there has been a connection error in the last 5 minutes? – knowledge20 Sep 24 '21 at 18:24
  • Or we could check whether the count in the 6th minute > the count in the 5th minute. – knowledge20 Sep 28 '21 at 18:28
  • Is it possible to calculate the time windows? Suppose the current time is 27 Sep 2021 09:45:50; then I want the data for the 5 minutes prior, i.e. 09:40:50 to 09:45:50, and the data for 09:35:50 to 09:40:50. Can you help me with such a query? – knowledge20 Sep 27 '21 at 01:11
  • Can you please help? – knowledge20 Sep 27 '21 at 02:25
0

It's not clear what the desired results are since the requirements keep changing. Perhaps this will help solve the problem.

index=foo "Error occured due to connection" earliest=-15m
| bin span=5m _time
| stats count by _time
| ```something else to get the final results```
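
One possible way to finish this, offered as an untested sketch rather than a definitive alert: compare the two most recent 5-minute windows directly and return a row only when the error shows up in both, i.e. it appeared and was still occurring five minutes later. Here `index=foo` is the same placeholder as above, and the field names `window`, `previous_count` and `current_count` are made up for illustration. It also assumes the alert is scheduled to run every 5 minutes and triggers when the number of results is greater than 0.

```spl
index=foo "Error occured due to connection" earliest=-10m latest=now ```index=foo is a placeholder; search the last 10 minutes```
| eval window=if(_time >= relative_time(now(), "-5m"), "current", "previous") ```label each event by the 5-minute window it falls in```
| stats count(eval(window="current")) AS current_count count(eval(window="previous")) AS previous_count ```one row holding both counts```
| where previous_count > 0 AND current_count > 0 ```keep the row only if the error is present in both windows```
```

With a run time of 09:45:50, the "previous" window covers 09:35:50 to 09:40:50 and the "current" window covers 09:40:50 to 09:45:50. If the error occurs only in the earlier window, the `where` clause drops the row and nothing is returned, which corresponds to the issue having healed itself.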
RichG