
In my Spring Boot project, which has several Spring Kafka consumers, I have added event listeners to monitor the health of these consumers. Here is the code:

import org.springframework.context.event.EventListener;
import org.springframework.kafka.event.ConsumerPausedEvent;
import org.springframework.kafka.event.ConsumerResumedEvent;
import org.springframework.kafka.event.ConsumerStoppedEvent;
import org.springframework.kafka.event.ListenerContainerIdleEvent;
import org.springframework.kafka.event.NonResponsiveConsumerEvent;
import org.springframework.stereotype.Component;

@Component
public class ApplicationContextListeningService {

    @EventListener
    public void handleConsumerPausedEvent(ConsumerPausedEvent event) {
        LOGGER_ERROR.warn(WARNING_KAFKA_CONSUMERPAUSEDEVENT + event.getSource() + LOG_MSG_DELIMITER + event.toString());
    }

    @EventListener
    public void handleConsumerResumedEvent(ConsumerResumedEvent event) {
        LOGGER_ERROR.warn(WARNING_KAFKA_CONSUMERRESUMEDEVENT + event.getSource() + LOG_MSG_DELIMITER + event.toString());
    }

    @EventListener
    public void handleConsumerStoppedEvent(ConsumerStoppedEvent event) {
        LOGGER_ERROR.error(ERROR_KAFKA_CONSUMERSTOPPEDEVENT + event.getSource() + LOG_MSG_DELIMITER + event.toString());
    }

    @EventListener
    public void handleListenerContainerIdleEvent(ListenerContainerIdleEvent event) {
        LOGGER_ERROR.error(ERROR_KAFKA_LISTENERCONTAINERIDLEEVENT + event.getListenerId() + LOG_MSG_DELIMITER + event.toString());
    }

    @EventListener
    public void handleNonResponsiveConsumerEvent(NonResponsiveConsumerEvent event) {
        LOGGER_ERROR.error(ERROR_KAFKA_NONRESPONSIVECONSUMEREVENT + event.getListenerId() + LOG_MSG_DELIMITER + event.toString());
    }

}

Does anyone know under what circumstances these events are published (and perhaps how I can manually trigger them for testing purposes)? Also, for the last three events (ConsumerStoppedEvent, ListenerContainerIdleEvent, and NonResponsiveConsumerEvent), when one of them occurs, is human intervention needed to address the issue (such as restarting the servers so the consumers are created again)? Thanks!

Hua

1 Answer


You can emulate them all by injecting a mock consumer factory into the container.

  • ConsumerStoppedEvent is emitted when you stop() the container.
  • ListenerContainerIdleEvent simply means no records have been received within the idleEventInterval, so it usually doesn't indicate a problem.
  • NonResponsiveConsumerEvent - it's harder to say; with older clients, poll() would block if the server was down, so we couldn't emit idle events (or do anything else).

I don't know whether you can still get them with more recent clients; but to simulate one, you just need to block in the mock consumer's poll() method for long enough for the monitor task to detect the problem and emit the event.
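As a rough sketch of the approach above (this is illustrative test wiring, not a definitive recipe: it assumes Mockito's BDD-style stubbing, the Spring Kafka 2.2-era API mentioned in the comments, and a hypothetical topic name; which poll() overload the container calls depends on the client version, so both are stubbed):

```java
// Illustrative test fragment: stub the ConsumerFactory so the container's
// consumer blocks inside poll(). The container's monitor task should then
// detect the stalled consumer and publish a NonResponsiveConsumerEvent.
@SuppressWarnings("unchecked")
ConsumerFactory<String, String> cf = mock(ConsumerFactory.class);
Consumer<String, String> consumer = mock(Consumer.class);
given(cf.createConsumer(any(), any(), any())).willReturn(consumer);

Answer<ConsumerRecords<String, String>> blockingPoll = inv -> {
    Thread.sleep(10_000); // simulate a consumer stuck in poll()
    return new ConsumerRecords<>(Collections.emptyMap());
};
// Stub both overloads; older containers call poll(long), newer ones poll(Duration)
willAnswer(blockingPoll).given(consumer).poll(anyLong());
willAnswer(blockingPoll).given(consumer).poll(any(Duration.class));

ContainerProperties props = new ContainerProperties("someTopic"); // hypothetical topic
props.setPollTimeout(1_000L);
props.setNoPollThreshold(1f);  // fire after ~1 * pollTimeout without a poll
props.setMonitorInterval(2);   // run the health check every 2 seconds

KafkaMessageListenerContainer<String, String> container =
        new KafkaMessageListenerContainer<>(cf, props);
container.setApplicationEventPublisher(event ->
        System.out.println("Event published: " + event));
container.start();
```

The key levers are noPollThreshold and monitorInterval: the monitor task compares the time since the last poll against pollTimeout * noPollThreshold, so shrinking those values makes the event fire quickly in a test.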

Gary Russell
  • Thanks @Gary Russell! I am running Spring Boot 2.1.0 and Spring Kafka 2.2.0. I have been checking my Kafka consumer logs from time to time in the lower environment. So far I have only seen NonResponsiveConsumerEvent in the logs. What I have noticed is that it does not seem to have any impact on the consumers, which are continuously fetching without any hesitation. The way I implemented it is that three consumers share one executor. A consumer checks the depth of the executor before it can submit a batch of messages. If the depth exceeds a threshold, the consumer has to wait... – Hua May 08 '19 at 06:28
  • The kafka offset of the batch is committed only after the batch is submitted to the executor. So it's most likely happening when a consumer has to wait an extended period of time. – Hua May 08 '19 at 06:33
  • First of all, you should try to keep up-to-date with releases; the current boot 2.1 version is 2.1.4, the current spring kafka version is 2.2.5. Second, if you suspend your listener threads like that, you need to be sure the consumer property `max.poll.interval.ms` is large enough so that kafka won't think the consumer has died. Finally, to suppress "spurious" non-responsive consumer events, you should ensure `pollTimeout * noPollThreshold` (container properties) is greater than the time you expect to suspend the thread. – Gary Russell May 08 '19 at 13:26
  • Thanks for the suggestions, @Gary Russell! I will modify the configurations accordingly! – Hua May 08 '19 at 19:12
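The property tuning Gary describes in his comment could look roughly like this (the numbers are illustrative, assuming listener threads may be suspended for up to five minutes; `factory` stands for the project's existing listener container factory bean):

```java
// Consumer property: max.poll.interval.ms must exceed the longest time a
// listener thread may be suspended, or Kafka will assume the consumer died
// and rebalance its partitions away.
Map<String, Object> consumerProps = new HashMap<>();
consumerProps.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600_000); // 10 min

// Container properties: pollTimeout * noPollThreshold should also exceed the
// expected suspension time, to suppress spurious NonResponsiveConsumerEvents.
ContainerProperties containerProps = factory.getContainerProperties();
containerProps.setPollTimeout(30_000L);  // 30s * 10 = 300s before the event fires
containerProps.setNoPollThreshold(10f);
```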