
My company would like to automatically scale the activity workers and each workflow worker independently, according to the load on a tasklist.

Reading the docs, I found the following metrics for activity workers:

  • cadence_activity_scheduled_to_start_latency_bucket
  • cadence_activity_scheduled_to_start_latency_count
  • cadence_activity_scheduled_to_start_latency_sum

However, these seem to be global metrics for activity workers. Is there a Cadence metric that would allow me to spot overload on each specific activity worker?

Example: We have 4 different activity workers: A, B, C and D. We would like to scale A, B, C or D independently, without impacting the others.

Sharan Foga

1 Answer


Understand scheduled_to_start_latency

scheduled_to_start_latency measures the time from when a task is scheduled to when a worker starts it. During that interval, the task is transferred from the matching service to an activity worker.

These are the potential hotspots when this latency gets high:

How to monitor whether an activity worker is overloaded

  • Monitoring the CPU, memory, thread usage and garbage collection of the activity worker is usually enough to make sure a worker is not overloaded
  • You can also use scheduled_to_start_latency, but high latency can have several different causes, as noted above. Use other metrics to rule those causes out.
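
As a rough illustration of combining the two signals, here is a minimal sketch in Python. It assumes you can read the cumulative buckets of cadence_activity_scheduled_to_start_latency_bucket separately for each tasklist (for example through a tasklist label on the metric, which you should verify in your own setup) and that you have a CPU utilization figure for the matching worker pool. The function names, thresholds and sample numbers are hypothetical and not part of Cadence.

```python
from typing import Dict, List, Tuple

# One histogram bucket: (upper bound in seconds, cumulative count).
Bucket = Tuple[float, float]


def histogram_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate quantile q from cumulative buckets by linear interpolation."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return 0.0
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                # The quantile falls in the open-ended bucket, so report
                # the highest finite bound instead of infinity.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


def should_scale_up(p95_latency: float, cpu_utilization: float,
                    latency_threshold: float = 1.0,
                    cpu_threshold: float = 0.8) -> bool:
    """Scale up only when latency is high AND the workers are busy.

    High latency with idle workers points at another cause, so adding
    workers would probably not help. Thresholds are hypothetical.
    """
    return p95_latency > latency_threshold and cpu_utilization > cpu_threshold


def scaling_decisions(latency_buckets: Dict[str, List[Bucket]],
                      cpu_by_pool: Dict[str, float]) -> Dict[str, bool]:
    """Return a scale-up decision per tasklist / worker pool."""
    return {
        name: should_scale_up(histogram_quantile(0.95, buckets), cpu_by_pool[name])
        for name, buckets in latency_buckets.items()
    }


# Made-up sample data for two of the worker pools from the question.
INF = float("inf")
SAMPLE_BUCKETS = {
    "A": [(0.1, 50), (0.5, 80), (1.0, 90), (5.0, 100), (INF, 100)],
    "B": [(0.1, 95), (0.5, 100), (1.0, 100), (5.0, 100), (INF, 100)],
}
SAMPLE_CPU = {"A": 0.92, "B": 0.35}
```

Calling scaling_decisions(SAMPLE_BUCKETS, SAMPLE_CPU) flags pool A only, because it is the only pool with both a high p95 latency and busy workers. Requiring both signals follows the point above that latency alone is ambiguous.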
Long Quanzheng