
We are running Kafka on the Confluent Platform, in a cluster of about 10 brokers and 5 ZooKeeper servers.

Some clients that request data from Kafka occasionally get read timeout errors while connecting to topic partitions. Over the same periods, the broker server logs clearly show the ISR repeatedly shrinking and expanding, with replicas reported as out of sync.

We need to understand why replicas intermittently fall out of the ISR.

The ZooKeeper session timeout is already set to 60 seconds.

[2021-06-16 18:32:30,055] INFO [Partition topicName broker=1001] Shrinking ISR from 1001,1007,1008 to 1001,1007. Leader: (highWatermark: 14317389, endOffset: 14317390). Out of sync replicas: (brokerId: 1008, endOffset: 14317389). (kafka.cluster.Partition)

[2021-06-16 18:32:30,060] INFO [Partition topicName broker=1001] ISR updated to [1001,1007] and zkVersion updated to [6434] (kafka.cluster.Partition)

[2021-06-16 18:32:31,771] INFO [ReplicaFetcher replicaId=1001, leaderId=1008, fetcherId=0] Error sending fetch request (sessionId=43827521, epoch=19413135) to node 1008: {}. (org.apache.kafka.clients.FetchSessionHandler)
java.io.IOException: Connection to 1008 was disconnected before the response was read
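
To see whether the same follower (1008 here) is the one that keeps dropping out, the "Shrinking ISR" lines can be tallied per broker. Below is a minimal Python sketch, assuming the log lines follow exactly the format shown above; the function names `parse_shrink_line` and `count_dropped` are made up for illustration and are not part of any Kafka tooling.

```python
import re
from collections import Counter

# Sample line copied from the broker log above.
SAMPLE = (
    "[2021-06-16 18:32:30,055] INFO [Partition topicName broker=1001] "
    "Shrinking ISR from 1001,1007,1008 to 1001,1007. "
    "Leader: (highWatermark: 14317389, endOffset: 14317390). "
    "Out of sync replicas: (brokerId: 1008, endOffset: 14317389). "
    "(kafka.cluster.Partition)"
)

# Header of a "Shrinking ISR" line: timestamp, partition, leader broker, old/new ISR.
SHRINK_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] INFO \[Partition (?P<partition>\S+) broker=(?P<leader>\d+)\] "
    r"Shrinking ISR from (?P<old>[\d,]+) to (?P<new>[\d,]+)\."
)
# Leader offsets, reported as "Leader: (highWatermark: N, endOffset: M)".
LEADER_RE = re.compile(r"Leader: \(highWatermark: (\d+), endOffset: (\d+)\)")
# Each out-of-sync replica, reported as "(brokerId: N, endOffset: M)".
REPLICA_RE = re.compile(r"\(brokerId: (\d+), endOffset: (\d+)\)")


def parse_shrink_line(line):
    """Return details of one ISR shrink event, or None for any other line."""
    header = SHRINK_RE.match(line)
    leader = LEADER_RE.search(line)
    if header is None or leader is None:
        return None
    leader_end = int(leader.group(2))
    # Pair each dropped broker with how many offsets it trails the leader by.
    dropped = [
        (int(broker), leader_end - int(end))
        for broker, end in REPLICA_RE.findall(line)
    ]
    return {
        "timestamp": header.group("ts"),
        "partition": header.group("partition"),
        "leader": int(header.group("leader")),
        "old_isr": [int(b) for b in header.group("old").split(",")],
        "new_isr": [int(b) for b in header.group("new").split(",")],
        "dropped": dropped,
    }


def count_dropped(lines):
    """Count how often each broker id was removed from an ISR."""
    counts = Counter()
    for line in lines:
        event = parse_shrink_line(line)
        if event is not None:
            counts.update(broker for broker, _lag in event["dropped"])
    return counts


if __name__ == "__main__":
    print(parse_shrink_line(SAMPLE))
    print(count_dropped([SAMPLE]))
```

Feeding the full server.log from each broker through `count_dropped` would show whether the shrinks cluster on one follower (pointing at that host or its network path) or are spread across the cluster. In the sample line the dropped replica trails the leader by a single offset, which may suggest the follower stopped fetching for a while rather than fell far behind on data, though that is only one data point.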
Majid Hajibaba
