I am creating a panel to show each instance's health status. In Loki, if "ERROR" is present in the log, the instance status should be red; otherwise it should be green.

I'm using the following query:

{component="dz-snmp", cloud=~"${cloud}", environment=~"${environment}", location=~"${location}",service="dz"} |= "ERROR"

I tried to visualize it with a Gauge panel and got the expected result when an instance is unhealthy: I set the threshold value to 1, so if the log contains the "ERROR" keyword, the visualization turns red. The problem is that when there are no errors (healthy state), the panel shows "No data".

When the instance log has no ERROR, the panel should be green, and when it has an ERROR, it should turn red. How can I achieve this?

Any help would be greatly appreciated!

Roopchand

2 Answers

1

Try the following query, which counts the matching "ERROR" lines over the dashboard time range:

count_over_time({component="dz-snmp", cloud=~"${cloud}", environment=~"${environment}", location=~"${location}",service="dz"} |= "ERROR"[$__range])

And set the following Gauge panel option:

[screenshot: Gauge panel option]

  • It helped, thanks! Now I can see 0 in green, but the problem is that it shows a single overall value rather than one per instance. I want to check each instance: if there's no error, that instance should show green, and if there's an error, that instance should show red, all in the same panel. – Roopchand Aug 29 '22 at 03:25
  • @Vijay, the metric query does not know anything about instances when there is no data. I think you need a second query to define which instances need to be displayed. There must be some other log output that you can query to get these instances. – Sascha Doerdelmann Aug 31 '22 at 08:09
0

I came across a solution that might be a little too complex for what you asked for, but at least it works:

Use the aggregate query given by Marcelo Ávila de Oliveira as query A.

Add the following query as query B:

count_over_time({component="dz-snmp", cloud=~"${cloud}", environment=~"${environment}", location=~"${location}",service="dz"} != "ERROR"[$__range]) - count_over_time({component="dz-snmp", cloud=~"${cloud}", environment=~"${environment}", location=~"${location}",service="dz"} != "ERROR"[$__range])

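Query B uses the inverted filter != "ERROR" and subtracts the count from itself, so it returns 0 for every stream that logged at least one non-ERROR line in the range. That gives the panel a value to show for healthy instances, where query A alone has no data.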
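
As a rough sketch, the two queries could also be folded into a single expression, assuming the streams carry a label that identifies each instance. The label name `instance` below is hypothetical (it does not appear in the question), so substitute whatever label distinguishes instances in your logs. The left side counts ERROR lines per instance, the right side contributes a 0 for every instance that logged anything else, and `or` keeps the right-hand value only where the left side has no series for that instance.

```logql
# Assumption: "instance" is a placeholder for the label that identifies an instance.
# Left side: number of ERROR lines per instance over the dashboard range.
sum by (instance) (
  count_over_time({component="dz-snmp", cloud=~"${cloud}", environment=~"${environment}", location=~"${location}",service="dz"} |= "ERROR"[$__range])
)
or
# Right side: 0 for every instance that logged at least one non-ERROR line.
(
  sum by (instance) (
    count_over_time({component="dz-snmp", cloud=~"${cloud}", environment=~"${environment}", location=~"${location}",service="dz"} != "ERROR"[$__range])
  ) * 0
)
```

With a threshold of 1, instances with at least one ERROR line should turn red and the others stay green. An instance that logged nothing at all in the selected range would still be missing from the panel.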