
Introduction:

Previously, I saw a similar question (this link), but mine is different because we use Kafka in KRaft mode instead of Kafka with ZooKeeper.

Specification:

Kafka version: 3.3.1
Number of brokers: 8
Minimum replication factor of topics: 3

Problem Description:

At the time of writing, I have experienced this issue numerous times. The relevant Kafka controller log is below:

[2023-01-09 09:53:03,929] WARN [Controller 3] maybeFenceReplicas: failed with unknown server exception NotLeaderException at epoch 2641 in 1913 us.  Renouncing leadership and reverting to the last committed offset 9986340. (org.apache.kafka.controller.QuorumController)
org.apache.kafka.raft.errors.NotLeaderException: Append failed because the replication is not the current leader
        at org.apache.kafka.raft.KafkaRaftClient.lambda$append$27(KafkaRaftClient.java:2262)
        at java.base/java.util.Optional.orElseThrow(Optional.java:408)
        at org.apache.kafka.raft.KafkaRaftClient.append(KafkaRaftClient.java:2261)
        at org.apache.kafka.raft.KafkaRaftClient.scheduleAtomicAppend(KafkaRaftClient.java:2257)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent$1.apply(QuorumController.java:813)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent$1.apply(QuorumController.java:792)
        at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:903)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:791)
        at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
        at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
        at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
        at java.base/java.lang.Thread.run(Thread.java:829)
[2023-01-09 09:53:03,931] INFO [Controller 3] writeNoOpRecord: failed with NotControllerException in 415741179 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] writeNoOpRecord: failed with NotControllerException in 206629449 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 206629220 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 206626538 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 205746648 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 7549 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 6986 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 6399 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,931] INFO [Controller 3] maybeFenceReplicas: failed with NotControllerException in 5912 us (org.apache.kafka.controller.QuorumController)
[2023-01-09 09:53:03,932] ERROR [Controller 3] Unexpected exception while executing deferred write event maybeFenceReplicas. Rescheduling for a minute from now. (org.apache.kafka.controller.QuorumController)
org.apache.kafka.common.errors.UnknownServerException: org.apache.kafka.raft.errors.NotLeaderException: Append failed because the replication is not the current leader
Caused by: org.apache.kafka.raft.errors.NotLeaderException: Append failed because the replication is not the current leader
        at org.apache.kafka.raft.KafkaRaftClient.lambda$append$27(KafkaRaftClient.java:2262)
        at java.base/java.util.Optional.orElseThrow(Optional.java:408)
        at org.apache.kafka.raft.KafkaRaftClient.append(KafkaRaftClient.java:2261)
        at org.apache.kafka.raft.KafkaRaftClient.scheduleAtomicAppend(KafkaRaftClient.java:2257)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent$1.apply(QuorumController.java:813)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent$1.apply(QuorumController.java:792)
        at org.apache.kafka.controller.QuorumController.appendRecords(QuorumController.java:903)
        at org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:791)
        at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
        at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
        at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
        at java.base/java.lang.Thread.run(Thread.java:829)

And

ERROR [Controller 3] processBrokerHeartbeat: unable to start processing because of NotControllerException. (org.apache.kafka.controller.QuorumController)

As this is our production cluster, we constantly monitor it using Prometheus and Grafana. The timestamps indicate that this broker had trouble at 2023-01-09 09:53. According to the monitoring, the other 7 brokers were working properly, so no data loss should have occurred, but the monitoring results differ from what we expected.

[monitoring screenshot]

The issue happened again at 11:31.

Observations:

Based on the monitoring screenshots and the messages in the topics, I assume that no data was lost.

Is this correct? How can we prevent this issue from recurring?


1 Answer


When a NotLeaderException is thrown, the Kafka producer should retry the write until it succeeds. This exception normally occurs when the leader broker fails and a new leader has not yet been elected.
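
As a minimal sketch of what that looks like on the producer side (the broker address and topic name below are placeholders, not values from your cluster), retriable "not leader" errors are retried automatically as long as retries and delivery.timeout.ms allow:

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class RetryingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // "broker1:9092" and "my-topic" are placeholders for your own cluster.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Retriable errors (e.g. "not leader" responses during a leader election)
        // are retried automatically until delivery.timeout.ms expires.
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);
        props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 100);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
                if (exception != null) {
                    // Only non-retriable errors, or errors that outlived delivery.timeout.ms, end up here.
                    exception.printStackTrace();
                }
            });
        }
    }
}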

It's quite hard to tell from the monitoring graph whether or not there was data loss. It looks like message throughput dropped while the leader was down, because the producers would have been receiving NotLeaderException until a new leader was elected. Once the new leader was elected, the producers were able to continue as normal.

This does not necessarily mean there was no data loss, though. It's the producer's responsibility to ensure no data is lost.

For example, if acks=0 and a message was sent to the topic but not received successfully before the leader failed, that message would not exist in the topic; however, the producer would have assumed a successful write and moved on to the next message.
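
To illustrate that failure mode (again with placeholder broker address and topic name), an acks=0 producer considers the send complete without waiting for any broker acknowledgement, so a message dropped during the leader failure disappears silently:

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FireAndForgetProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");  // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "0");  // fire-and-forget: no broker acknowledgement

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The send is treated as successful as soon as the record leaves the client;
            // if the leader dies before persisting it, the message is lost without any error.
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}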

To ensure message availability and durability, the following configurations should be set (acks on the producer, min.insync.replicas on the topic or broker), as sketched after the list:

acks=all

For a write to be considered successful, it must be acknowledged by all replicas in the ISR.

min.insync.replicas >= 2

With acks=all, at least 2 replicas must be in sync and acknowledge the write before it is considered successful.
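
A rough sketch of how these could be applied, assuming a topic named my-topic and a broker at broker1:9092 (both placeholders): min.insync.replicas is a topic/broker setting, so it is changed via the Admin API (or kafka-configs.sh), while acks is set on the producer.

import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DurabilitySettings {
    public static void main(String[] args) throws Exception {
        Properties adminProps = new Properties();
        adminProps.put("bootstrap.servers", "broker1:9092");  // placeholder

        // Set min.insync.replicas=2 on the topic (topic name is a placeholder).
        try (Admin admin = Admin.create(adminProps)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            AlterConfigOp setMinIsr = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(setMinIsr))).all().get();
        }

        // On the producer side, require acknowledgement from all in-sync replicas.
        Properties producerProps = new Properties();
        producerProps.put("acks", "all");
        producerProps.put("enable.idempotence", "true");  // also avoids duplicates on retry
    }
}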

Depending on which Kafka vendor or distribution you use, some of the above configurations may already be set by default.

Hope this helps!
