We have a Sumo Logic alert that fires if more than 10 errors are logged in 60 minutes.
I would prefer something like:

  1. If there is a spike and all the errors happen within, e.g., 1 minute (treat the issue as auto-resolved), do not generate an alert.

How can I write such a Sumo Logic query?

Variants of the requirement:

  1. Logs have a clientIp field; if all errors are reported for the same client, do not generate an alert (the problem is with that particular client, not with the application).

  2. If more than 10 errors are logged in 60 minutes, send an alert, unless the errors are of type A. If there are more than 100 errors of type A, send the alert anyway (type A errors are acceptable unless there are too many of them).

  3. If more than 10 errors are logged in 60 minutes, send an alert only if the last error happened less than 30 minutes ago (otherwise treat the issue as auto-fixed).

Michael Freidgeim

1 Answer

I am not entirely sure how your data is shaped, but...

if there is a spike and all the errors happen within e.g. 1 minute (treat the issue as auto-resolved), do not generate an alert.

You can solve this by aggregating per time slice:

| timeslice 1m
| count by _timeslice
| where _count > 1

or similar.
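
To turn this into an alert condition, one possible sketch is to count how many one-minute slices contain errors and only alert when the errors span more than one slice. This assumes the search scope already matches only error logs and the time range is the last 60 minutes; errors, total_errors and active_minutes are illustrative names, not fields from your data:

```
// Assumption: the search scope before this clause returns only error logs
| timeslice 1m
// Number of errors in each one-minute slice
| count as errors by _timeslice
// Total errors, and number of slices that contained at least one error
| sum(errors) as total_errors, count as active_minutes
// Alert only if the threshold is exceeded and errors span more than one minute
| where total_errors > 10 and active_minutes > 1
```

With the alert set to fire when the number of results is greater than 0, a burst confined to a single minute produces no result row and therefore no alert.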

if all errors are reported for the same client, do not generate an alert

It sounds like:

| count by _timeslice, clientIp

would do the job.
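
To suppress the alert when only one client is affected, one option is to count distinct clients rather than group by them. A minimal sketch, assuming the search scope already returns only error logs for the last 60 minutes and that clientIp is a parsed field; errors and clients are illustrative names:

```
// Assumption: clientIp is already extracted as a field
| count as errors, count_distinct(clientIp) as clients
// Alert only if the threshold is exceeded and more than one client is affected
| where errors > 10 and clients > 1
```

Note that count_distinct is an approximation for large numbers of distinct values, which should not matter here since the check is only "one client versus more than one".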

if more than 10 errors are logged in 60 min, send an alert, unless the errors are of type A, but if there are more than 100 errors of type A,

A rough sketch of the query clause would be:

| if(something, 1, 0) as is_of_type_A
| count by is_of_type_A, ...
| where (is_of_type_A = 1 and _count > 100)
       OR (is_of_type_A = 0 and _count > 10)
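
For the last variant in the question (alert only if the most recent error is less than 30 minutes old), a possible sketch is to aggregate the latest message time alongside the count. This assumes the search scope matches only error logs over the last 60 minutes and that _messagetime (epoch milliseconds) reflects when the error occurred; errors, last_error and ms_since_last_error are illustrative names:

```
// Assumption: the search scope before this clause returns only error logs
| count as errors, max(_messagetime) as last_error
// now() returns the current epoch time in milliseconds
| now() - last_error as ms_since_last_error
// 1800000 ms = 30 minutes
| where errors > 10 and ms_since_last_error < 1800000
```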

Disclaimer: I am currently employed by Sumo Logic.

Grzegorz Oledzki