How to Measure End-to-End Latency in a Kafka Scenario with JMX Metrics?

Question

I have set up a Kafka scenario, including three servers: one with a Kafka producer, one with a Kafka server, and one with a consumer application.

Now, I need to measure the average end-to-end latency of Kafka messages from the producer to the consumer, including the latency of each part. My partition replication factor is 1, so there are no followers.

Based on the official documentation, I believe the composition of end-to-end latency and corresponding JMX metrics are as follows:

1. Network transmission time from the producer to the Kafka server (unknown).
2. Time spent waiting in the request queue on the Kafka server (kafka.network:type=RequestMetrics,name=RequestQueueTimeMs).
3. Leader processing time on the Kafka server (kafka.network:type=RequestMetrics,name=LocalTimeMs,request=Produce).
4. Time spent waiting to be fetched within the Kafka server (unknown).
5. Network transmission time from the Kafka server to the consumer (the consumer's fetch-latency-avg minus kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer, divided by 2).
6. Actual consumption time by the consumer (measured by the application).

The overall latency (1-6) can be obtained by calculating timestamps.
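For reference, the broker-side metrics named in the list could be read over a remote JMX connection roughly as follows. This is only a sketch under assumptions: the class name BrokerRequestMetrics and the host:port argument are hypothetical, the broker must have remote JMX enabled, the request=Produce tag is assumed for every metric, and "Mean" is assumed to be the attribute exposed by these histogram MBeans.

```java
import java.util.Hashtable;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of Kafka: reads request-timing MBeans from a broker.
final class BrokerRequestMetrics {

    // Builds kafka.network:type=RequestMetrics,name=<name>,request=<request>.
    static ObjectName requestMetric(String name, String request)
            throws MalformedObjectNameException {
        Hashtable<String, String> keys = new Hashtable<>();
        keys.put("type", "RequestMetrics");
        keys.put("name", name);
        keys.put("request", request);
        return new ObjectName("kafka.network", keys);
    }

    // Assumption: the MBean exposes a numeric "Mean" attribute (milliseconds).
    static double readMean(MBeanServerConnection conn, ObjectName metric) throws Exception {
        return ((Number) conn.getAttribute(metric, "Mean")).doubleValue();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: BrokerRequestMetrics <broker-host>:<jmx-port>");
            return;
        }
        // Standard JMX-over-RMI service URL; args[0] is the broker's JMX host:port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            for (String name : new String[] {"RequestQueueTimeMs", "LocalTimeMs", "TotalTimeMs"}) {
                ObjectName metric = requestMetric(name, "Produce");
                System.out.printf("%s mean = %.3f ms%n", metric, readMean(conn, metric));
            }
        }
    }
}
```

Each value is an average over all Produce requests handled by the broker, not a per-message figure, so it can only approximate the per-part breakdown.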

Is my understanding correct? Are there any methods to measure the latencies of parts 1 and 4?

score 0 · Answer 1 · answered Aug 01 '23 at 18:32

Kafka records carry a timestamp which, by default (CreateTime), is set when the record is added to the producer buffer. On the consumer side, you can track System.currentTimeMillis() - record.timestamp(). This is naturally the sum of all the parts you mentioned, provided the producer and consumer clocks are synchronized.
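A minimal sketch of that bookkeeping, kept free of Kafka dependencies so it runs on its own. The class EndToEndLatencyTracker is hypothetical; in a real consumer loop you would pass record.timestamp() from each ConsumerRecord together with System.currentTimeMillis().

```java
// Hypothetical helper: accumulates end-to-end latency samples in milliseconds.
final class EndToEndLatencyTracker {
    private long count;
    private long totalMs;
    private long maxMs;

    // recordTimestampMs: the record's timestamp (e.g. record.timestamp()).
    // receivedAtMs: the consumer's wall clock when the record was handled.
    void record(long recordTimestampMs, long receivedAtMs) {
        // Clamp negative values, which can only come from clock skew between hosts.
        long latency = Math.max(0L, receivedAtMs - recordTimestampMs);
        count++;
        totalMs += latency;
        maxMs = Math.max(maxMs, latency);
    }

    long count() {
        return count;
    }

    double averageMs() {
        return count == 0 ? 0.0 : (double) totalMs / count;
    }

    long maxMs() {
        return maxMs;
    }

    public static void main(String[] args) {
        EndToEndLatencyTracker tracker = new EndToEndLatencyTracker();
        // Stand-ins for (record.timestamp(), System.currentTimeMillis()) pairs.
        tracker.record(1_000L, 1_040L);
        tracker.record(2_000L, 2_100L);
        System.out.printf("samples=%d avg=%.1f ms max=%d ms%n",
                tracker.count(), tracker.averageMs(), tracker.maxMs());
    }
}
```

Because the two timestamps come from different machines, the result is only as accurate as the clock synchronization (for example via NTP) between the producer and consumer hosts.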

answered Aug 01 '23 at 18:32

OneCricketeer


Thank you for your response! However, I would like a more detailed latency distribution chart, so I would like to know the latency of parts 1 or 4 in my question. – KatherineRan Aug 02 '23 at 01:47
Once the message reaches the broker, it's part of a batch, and consumers fetch partial batches as well all at once, so you won't be able to track at that granular level... Instead, try using distributed tracing https://github.com/openzipkin-contrib/brave-kafka-interceptor – OneCricketeer Aug 02 '23 at 13:39
