
I have a streaming job implemented on top of Apache Beam that reads messages from Apache Kafka, processes them, and writes them to BigTable.

I would like to get ingress/egress throughput metrics for this job, i.e. how many msg/sec the job is reading and how many msg/sec it is writing.

Looking at the job graph visualization, I can see that there is a throughput metric; for example, see the picture below:

[Image: throughput example]

However, looking at the documentation, this metric is not available in Stackdriver.

Is there any existing solution to get these metrics?

marknorkin

1 Answer


We are looking into publishing a throughput metric to Stackdriver, but one does not currently exist. The ElementCount metric (element_count in Stackdriver) is the only metric available in that UI or through Stackdriver that could be used to measure throughput; if the graph is displaying throughput, it must be computed from that metric. Unfortunately, the metric is exported to Stackdriver as a Gauge, so it can't be directly interpreted as a rate there.
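If you need a number in the meantime, one workaround is to pull the gauge samples yourself from the Stackdriver Monitoring API and difference consecutive points. Below is a rough sketch in Java using the google-cloud-monitoring client; it assumes the metric is exposed as dataflow.googleapis.com/job/element_count with a job_name resource label and a ptransform metric label (check the metric descriptor in your project, these label names are assumptions), and the project/job names are placeholders:

```java
import com.google.cloud.monitoring.v3.MetricServiceClient;
import com.google.monitoring.v3.ListTimeSeriesRequest;
import com.google.monitoring.v3.Point;
import com.google.monitoring.v3.ProjectName;
import com.google.monitoring.v3.TimeInterval;
import com.google.monitoring.v3.TimeSeries;
import com.google.protobuf.util.Timestamps;
import java.util.List;

public class ElementCountRate {
  public static void main(String[] args) throws Exception {
    String project = "my-project";      // placeholder project id
    String jobName = "my-dataflow-job"; // placeholder Dataflow job name

    // Look at the last 10 minutes of samples.
    long nowMillis = System.currentTimeMillis();
    TimeInterval interval = TimeInterval.newBuilder()
        .setStartTime(Timestamps.fromMillis(nowMillis - 10 * 60 * 1000))
        .setEndTime(Timestamps.fromMillis(nowMillis))
        .build();

    // Filter on the element_count metric for one job; verify the exact
    // metric type and label names against your project's metric descriptors.
    String filter = "metric.type=\"dataflow.googleapis.com/job/element_count\""
        + " AND resource.labels.job_name=\"" + jobName + "\"";

    ListTimeSeriesRequest request = ListTimeSeriesRequest.newBuilder()
        .setName(ProjectName.of(project).toString())
        .setFilter(filter)
        .setInterval(interval)
        .setView(ListTimeSeriesRequest.TimeSeriesView.FULL)
        .build();

    try (MetricServiceClient client = MetricServiceClient.create()) {
      for (TimeSeries series : client.listTimeSeries(request).iterateAll()) {
        List<Point> points = series.getPointsList();
        if (points.size() < 2) {
          continue;
        }
        // Points come back newest first; approximate a rate by differencing
        // the two most recent gauge samples of the cumulative element count.
        Point latest = points.get(0);
        Point previous = points.get(1);
        double deltaCount =
            latest.getValue().getInt64Value() - previous.getValue().getInt64Value();
        double deltaSeconds =
            (Timestamps.toMillis(latest.getInterval().getEndTime())
                - Timestamps.toMillis(previous.getInterval().getEndTime())) / 1000.0;
        if (deltaSeconds <= 0) {
          continue;
        }
        System.out.printf("%s: ~%.1f elements/sec%n",
            series.getMetric().getLabelsOrDefault("ptransform", "unknown"),
            deltaCount / deltaSeconds);
      }
    }
  }
}
```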

A small secondary point: Dataflow doesn't actually export a metric measuring flow into and out of external sources. The ElementCount metric only measures flow into the inter-transform collections. But as long as your read/write transforms are basically pass-throughs, the flow into/out of the adjacent collections should be sufficient.
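If you control the pipeline code, you can also instrument those boundaries yourself with a pass-through DoFn that bumps a custom Beam Counter right after the read and right before the write. This is only a sketch, not Dataflow's own mechanism; the "throughput" namespace, counter names, and step names below are made up:

```java
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.transforms.DoFn;

/** Pass-through DoFn that counts every element flowing past one point in the pipeline. */
public class CountElementsFn<T> extends DoFn<T, T> {
  private final Counter counter;

  public CountElementsFn(String counterName) {
    // "throughput" is just an illustrative metrics namespace; pick whatever fits your job.
    this.counter = Metrics.counter("throughput", counterName);
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    counter.inc();          // bump the user counter
    c.output(c.element());  // pass the element through unchanged
  }
}

// Hypothetical wiring: count right after the Kafka read and right before the Bigtable write.
// PCollection<Msg> read = kafkaMessages
//     .apply("CountIngress", ParDo.of(new CountElementsFn<Msg>("msgs_read")));
// ...processing...
// mutations.apply("CountEgress", ParDo.of(new CountElementsFn<Mutation>("msgs_written")))
//          .apply("WriteToBigtable", ...);
```

Dataflow shows user counters like these alongside the job and, depending on your setup, exports them to Stackdriver as custom metrics, where a rate aligner gives you msg/sec. Since the DoFn is generic, you may need to set the output coder on the resulting PCollection explicitly if Beam can't infer it.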

Andrea