Metric Absence Alert on Logs-based Metrics from Pod Triggered on Pod Reschedule

Question

Setup

Note: Using pseudo-code instance notation: ObjectType("<name>"[, <attr>: <attr-value>]).

We have a Container("k8s-snapshots") in a Pod("k8s-snapshots-0") in a StatefulSet("k8s-snapshots", spec.replicas: 1).

We expect at most 1 Pod to run at any point in time.

We have a Logs-based Counter Metric("k8s-snapshots/snapshot-created") with the filter:

resource.type="container"
resource.labels.cluster_name="my-cluster"
logName="projects/my-project/logs/k8s-snapshots"
jsonPayload.event:"snapshot.created"

We have a Stackdriver Policy:

Policy(
  Name: "snapshot metric absent",
  Condition: Condition(
    Metric("k8s-snapshots/snapshot-created"),
    is absent for: "more than 30 minutes"
  )
)

The policy is meant to detect when Container("k8s-snapshots") has stopped creating snapshots.

Expected result

An alert is triggered only if no instance of Pod("k8s-snapshots-0") has logged any event matching Metric("k8s-snapshots/snapshot-created") for more than 30 minutes.

Result

Policy(Name: "snapshot metric absent") is violated each time Pod("k8s-snapshots-0") is rescheduled.

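It seems that a sub-metric of the main logs-based metric is created for each instance of Pod("k8s-snapshots-0"), and that Stackdriver evaluates the absence condition for each sub-metric separately, so the sub-metric of the replaced instance goes absent and triggers the alert.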
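To make the suspected behaviour concrete, here is a minimal Python sketch of it. It is a toy model of the hypothesis above, not Stackdriver's actual evaluation logic: the pod identifiers (`pod-uid-a`, `pod-uid-b`), the timestamps, and both helper functions are invented for illustration. It compares checking absence per sub-metric with checking absence once across all sub-metrics combined.

```python
def absent_series(points, now, window):
    """Return the series keys whose newest point is older than `window`."""
    latest = {}
    for key, minute in points:
        latest[key] = max(minute, latest.get(key, minute))
    return sorted(key for key, minute in latest.items() if now - minute > window)


def absent_after_aggregation(points, now, window):
    """Merge all series into one and check absence a single time."""
    newest = max(minute for _, minute in points)
    return now - newest > window


# (hypothetical pod instance id, minute at which snapshot.created was counted)
points = [
    ("pod-uid-a", 0), ("pod-uid-a", 10), ("pod-uid-a", 20),   # before reschedule
    ("pod-uid-b", 30), ("pod-uid-b", 40), ("pod-uid-b", 50),  # after reschedule
    ("pod-uid-b", 60),
]

# Per-series check: the old instance's series has been silent for 40 minutes.
per_series = absent_series(points, now=60, window=30)

# Combined check: the newest point overall is current, so nothing is absent.
combined = absent_after_aggregation(points, now=60, window=30)

print(per_series, combined)
```

In this model the per-series check flags the series of the replaced instance even though snapshots never stopped, while the combined check stays quiet. That matches the observed alerts, provided the hypothesis about per-instance sub-metrics is correct.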

score 0 · Answer 1 · answered Jan 19 '18 at 21:51

Are you still experiencing the issue? With the Stackdriver API you can aggregate metrics (including custom metrics), which the UI does not support yet. You can also visit this link.
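As a rough illustration of what aggregating through the API could look like, the sketch below builds an alert policy body as a plain Python dict. The field names (`conditionAbsent`, `aggregations`, `perSeriesAligner`, `crossSeriesReducer`, `groupByFields`) follow the Monitoring API v3 AlertPolicy resource, but several things are assumptions to verify against your project: the `build_absence_policy` helper is hypothetical, `gke_container` is a guess at the monitored resource type, the `logging.googleapis.com/user/` prefix assumes a user-defined logs-based metric, and `ALIGN_SUM` with a 60 second alignment period is only one plausible choice. The intent is that the empty `groupByFields` collapses all per-Pod time series into one before absence is evaluated.

```python
import json


def build_absence_policy(metric_name, resource_type, absent_minutes):
    """Build an alert policy body that checks absence on one combined series."""
    metric_filter = (
        f'metric.type="logging.googleapis.com/user/{metric_name}" '
        f'AND resource.type="{resource_type}"'
    )
    return {
        "displayName": "snapshot metric absent",
        "combiner": "OR",
        "conditions": [{
            "displayName": "snapshot-created absent",
            "conditionAbsent": {
                "filter": metric_filter,
                # Durations are expressed in seconds, e.g. "1800s".
                "duration": f"{absent_minutes * 60}s",
                "aggregations": [{
                    "alignmentPeriod": "60s",           # assumed period
                    "perSeriesAligner": "ALIGN_SUM",    # assumed aligner
                    "crossSeriesReducer": "REDUCE_SUM",
                    "groupByFields": [],                # merge all series
                }],
            },
        }],
    }


# "gke_container" is an assumed resource type; adjust it to your cluster.
policy = build_absence_policy(
    "k8s-snapshots/snapshot-created", "gke_container", 30
)
print(json.dumps(policy, indent=2))
```

The printed JSON would then be submitted through whichever client or REST call you use to create alert policies. Whether it removes the spurious alerts depends on the per-instance series really being the cause.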

KarthickN
