
I have a Spring Boot app configured with spring-kafka, and I want to handle every kind of error that can happen while listening to a topic. If any message cannot be consumed, whether because of a deserialization problem or any other exception, there should be 2 retries, after which the message should be logged to an error file. I have two approaches that could be followed:

First approach (using SeekToCurrentErrorHandler with DeadLetterPublishingRecoverer):

@Autowired
KafkaTemplate<String,Object> template;

@Bean(name = "kafkaSourceProvider")
public ConcurrentKafkaListenerContainerFactory<K, V> consumerFactory() {
        Map<String, Object> config = appProperties.getSource()
                .getProperties();
        ConcurrentKafkaListenerContainerFactory<K, V> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(config));

        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template,
                (r, e) -> {
                    if (e instanceof FooException) {
                        return new TopicPartition(r.topic() + ".DLT", r.partition());
                    }
                    // the destination resolver must return a TopicPartition for every
                    // failed record, so route all other exceptions to the same DLT here
                    return new TopicPartition(r.topic() + ".DLT", r.partition());
                });
        ErrorHandler errorHandler = new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(0L, 2L));

        factory.setErrorHandler(errorHandler);
        return factory;
    }

But for this we require an additional topic (a new .DLT topic), and only then can we log the record to a file.
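If the dead-letter topic is not already provisioned, it can be declared as a NewTopic bean next to the KafkaAdmin below; a minimal sketch (the bean name and the partition/replica counts are illustrative):

@Bean
public NewTopic dltTopic() {
    // the DeadLetterPublishingRecoverer publishes to the same partition by default,
    // so the DLT should normally have at least as many partitions as the source topic
    return TopicBuilder.name(MY_TOPIC + ".DLT").partitions(1).replicas(1).build();
}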

@Bean
    public KafkaAdmin admin() {
        Map<String, Object> configs = new HashMap<>();
        configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                StringUtils.arrayToCommaDelimitedString(kafkaEmbedded().getBrokerAddresses()));
        return new KafkaAdmin(configs);
    }
    
@KafkaListener( topics = MY_TOPIC + ".DLT", groupId = MY_ID)
public void listenDlt(ConsumerRecord<String, SomeClassName> consumerRecord,
    @Header(KafkaHeaders.DLT_EXCEPTION_STACKTRACE) String exceptionStackTrace) {

    logger.error(exceptionStackTrace);
}

Second approach (using a custom SeekToCurrentErrorHandler):

@Bean
    public ConcurrentKafkaListenerContainerFactory<K, V> consumerFactory() {
        Map<String, Object> config = appProperties.getSource()
                .getProperties();
        ConcurrentKafkaListenerContainerFactory<K, V> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(config));
        
        factory.setErrorHandler(new CustomSeekToCurrentErrorHandler());
        factory.setRetryTemplate(retryTemplate());
        return factory;
    }

private RetryTemplate retryTemplate() {
    RetryTemplate retryTemplate = new RetryTemplate();
    retryTemplate.setBackOffPolicy(backOffPolicy());
    retryTemplate.setRetryPolicy(new SimpleRetryPolicy(3)); // initial attempt plus 2 retries
    return retryTemplate;
}
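The backOffPolicy() helper is not shown above; a minimal sketch using spring-retry's FixedBackOffPolicy (the 1-second period is an assumption):

private BackOffPolicy backOffPolicy() {
    // fixed pause between in-memory retry attempts (the period is illustrative)
    FixedBackOffPolicy policy = new FixedBackOffPolicy();
    policy.setBackOffPeriod(1000L);
    return policy;
}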

public class CustomSeekToCurrentErrorHandler extends SeekToCurrentErrorHandler {

private static final int MAX_RETRY_ATTEMPTS = 2;

CustomSeekToCurrentErrorHandler() {
    super(MAX_RETRY_ATTEMPTS);
}

@Override
public void handle(Exception exception, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
    try {
        if (!records.isEmpty()) {
            log.warn("Exception: {} occurred with message: {}", exception, exception.getMessage());
            
            super.handle(exception, records, consumer, container);
        }
    } catch (SerializationException e) {
        log.warn("Exception: {} occurred with message: {}", e, e.getMessage());
    }
}

}

Can anyone suggest the standard way to implement this kind of feature? With the first approach there is the overhead of creating .DLT topics and an additional @KafkaListener; with the second approach, we can log the failed consumer record directly.

user2594

2 Answers


With the first approach, it is not necessary to use a DeadLetterPublishingRecoverer; you can use any ConsumerRecordRecoverer that you want. In fact, the default recoverer simply logs the failed message.

/**
 * Construct an instance with the default recoverer which simply logs the record after
 * the backOff returns STOP for a topic/partition/offset.
 * @param backOff the {@link BackOff}.
 * @since 2.3
 */
public SeekToCurrentErrorHandler(BackOff backOff) {
    this(null, backOff);
}

And, in the FailedRecordTracker...

if (recoverer == null) {
    this.recoverer = (rec, thr) -> {
        
        ...

        logger.error(thr, "Backoff "
            + (failedRecord == null
                ? "none"
                : failedRecord.getBackOffExecution())
            + " exhausted for " + ListenerUtils.recordToString(rec));
    };
}
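So, for the requirement in the question (two retries, then just log), a minimal sketch is to pass your own logging recoverer instead of the DeadLetterPublishingRecoverer; no extra topic is needed (the log message is illustrative):

SeekToCurrentErrorHandler errorHandler = new SeekToCurrentErrorHandler(
        // recoverer: called once the back-off is exhausted; just log the failed record
        (rec, ex) -> log.error("Giving up on {}-{}@{}", rec.topic(), rec.partition(), rec.offset(), ex),
        new FixedBackOff(0L, 2L)); // 0 ms delay, 2 retries (3 delivery attempts in total)
factory.setErrorHandler(errorHandler);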

Backoff (and a limit to retries) was added to the error handler after adding retry in the listener adapter, so it's "newer" (and preferred).

Also, using in-memory retry can cause issues with rebalancing if long BackOffs are employed.
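If long in-memory back-offs are unavoidable, a common mitigation is to raise the consumer's max.poll.interval.ms so the group does not rebalance while the listener thread is waiting between attempts; a sketch against the question's config map (the value is illustrative):

// allow up to 10 minutes between polls before the consumer is considered failed (illustrative)
config.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000);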

Finally, only the SeekToCurrentErrorHandler can deal with deserialization problems (via the ErrorHandlingDeserializer).

EDIT

Use the ErrorHandlingDeserializer together with a SeekToCurrentErrorHandler. Deserialization exceptions are considered fatal and the recoverer is called immediately.

See the documentation.

Here is a simple Spring Boot application that demonstrates it:

@SpringBootApplication
public class So63236346Application {


    private static final Logger log = LoggerFactory.getLogger(So63236346Application.class);


    public static void main(String[] args) {
        SpringApplication.run(So63236346Application.class, args);
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so63236346").partitions(1).replicas(1).build();
    }

    @Bean
    ErrorHandler errorHandler() {
        return new SeekToCurrentErrorHandler((rec, ex) -> log.error(ListenerUtils.recordToString(rec, true) + "\n"
                + ex.getMessage()));
    }

    @KafkaListener(id = "so63236346", topics = "so63236346")
    public void listen(String in) {
        System.out.println(in);
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        return args -> {
            template.send("so63236346", "{\"field\":\"value1\"}");
            template.send("so63236346", "junk");
            template.send("so63236346", "{\"field\":\"value2\"}");
        };
    }

}
package com.example.demo;

public class Thing {

    private String field;

    public Thing() {
    }

    public Thing(String field) {
        this.field = field;
    }

    public String getField() {
        return this.field;
    }

    public void setField(String field) {
        this.field = field;
    }

    @Override
    public String toString() {
        return "Thing [field=" + this.field + "]";
    }

}
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
spring.kafka.consumer.properties.spring.deserializer.value.delegate.class=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.consumer.properties.spring.json.value.default.type=com.example.demo.Thing

Result

Thing [field=value1]
2020-08-10 14:30:14.780 ERROR 78857 --- [o63236346-0-C-1] com.example.demo.So63236346Application   : so63236346-0@7
Listener failed; nested exception is org.springframework.kafka.support.serializer.DeserializationException: failed to deserialize; nested exception is org.apache.kafka.common.errors.SerializationException: Can't deserialize data [[106, 117, 110, 107]] from topic [so63236346]
2020-08-10 14:30:14.782  INFO 78857 --- [o63236346-0-C-1] o.a.k.clients.consumer.KafkaConsumer     : [Consumer clientId=consumer-so63236346-1, groupId=so63236346] Seeking to offset 8 for partition so63236346-0
Thing [field=value2]
Gary Russell
  • Thanks @GaryRussell for your reply. What if there are other exceptions too, like TimeoutException? Forget the retries; I just need to log any exception that may occur, so do you think SeekToCurrentErrorHandler alone will suffice? Though I understand such exceptions are considered non-retryable, does a back-off policy still have to be considered? – user2594 Aug 03 '20 at 21:07
  • You can specify other non-retryable exceptions to the STCEH via the `addNotRetryableException()` - for such exceptions, retries are skipped and the recoverer is called on the first failure. If you don't want to retry **any** exceptions, then add a simple logging error handler. – Gary Russell Aug 03 '20 at 21:12
  • Thanks for your inputs. Is there a way to skip these records in the next poll, since anything handled by the STCEH will come back in the next poll? Do you have any example of a simple logging error handler? – user2594 Aug 03 '20 at 21:18
  • Then @Gary Russell, I think KafkaListenerErrorHandler will suffice if we don't have to make any retries – user2594 Aug 03 '20 at 21:41
  • It is not clear what you mean; a "recovered" record will not be delivered again. See `LoggingErrorHandler`. The `KafkaListenerErrorHandler` is at a different level; it has access to the converted `Message<?>`; the container error handler only gets the raw `ConsumerRecord`s. The `KafkaListenerErrorHandler` is mainly used if a request/reply listener wants to return some meaningful error as a reply instead of throwing an exception to the container. – Gary Russell Aug 03 '20 at 21:47
  • Thanks @Gary Russell for the insights. I have come to the conclusion that there can be only one error handler per container, and I can either use the STCEH with no back-off or a custom KafkaListenerErrorHandler on the @KafkaListener method for just logging purposes if any exception occurs – user2594 Aug 04 '20 at 12:30
  • In the above example, the back-off used with the STCEH is the default value (10). Any suggestions if I do not want any backoff/retries? – user2594 Aug 10 '20 at 20:10
  • The default back-off is 0 delay with 9 retries (10 delivery attempts total) but `DeserializationExceptions` won't get retried (see `addNotRetryableException()` for default not-retryable exceptions). However, you can just add `new FixedBackOff(0L, 0L)` to never retry for any exception. – Gary Russell Aug 10 '20 at 20:19
  • Thanks @Gary Russell for the example. Earlier there was a lot of confusion around using the ErrorHandlingDeserializer with the STCEH, but now things are working fine. Thanks a ton – user2594 Aug 10 '20 at 21:55

The expectation was to log any exception that we might get at the container level as well as the listener level.

Without retrying, the following is how I have done the error handling:

If we encounter an exception at the container level, we should be able to log the message payload with the error description, seek past that offset, skip it, and carry on consuming from the next offset. Although this is done here only for DeserializationException, the remaining exceptions also need seeks performed so that their offsets are skipped.

@Component
public class KafkaContainerErrorHandler implements ErrorHandler {

    private static final Logger logger = LoggerFactory.getLogger(KafkaContainerErrorHandler.class);

    @Override
    public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> records, Consumer<?, ?> consumer, MessageListenerContainer container) {
        String s = thrownException.getMessage().split("Error deserializing key/value for partition ")[1].split(". If needed, please seek past the record to continue consumption.")[0];

        // modify below logic according to your topic nomenclature
        String topics = s.substring(0, s.lastIndexOf('-'));
        int offset = Integer.parseInt(s.split("offset ")[1]);
        int partition = Integer.parseInt(s.substring(s.lastIndexOf('-') + 1).split(" at")[0]);

        logger.error("...");
        TopicPartition topicPartition = new TopicPartition(topics, partition);
        logger.info("Skipping {} - {} offset {}",  topics, partition, offset);
        consumer.seek(topicPartition, offset + 1);
    }

    @Override
    public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord) {

    }
}


 factory.setErrorHandler(kafkaContainerErrorHandler);

If we get an exception at the @KafkaListener level, I configure my listener with a custom error handler and log the exception along with the message, as can be seen below:

@Bean("customErrorHandler")
    public KafkaListenerErrorHandler listenerErrorHandler() {
        return (m, e) -> {
            logger.error(...);
            return m;
        };
    }
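The handler is then referenced by bean name from the listener; a minimal sketch (the topic name and the process() call are illustrative):

@KafkaListener(topics = "my-topic", errorHandler = "customErrorHandler")
public void listen(SomeClassName payload) {
    // any exception thrown here is passed to the customErrorHandler bean above
    process(payload); // hypothetical business logic
}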
user2594
  • This will not work - you will lose the remaining records in the list - you need to perform seeks on those partitions too so they will be refetched on the next poll. – Gary Russell Aug 10 '20 at 14:45
  • Are you referring to the customErrorHandler because I am seeking on partitions at the container level and skipping those offset for which exception has occurred. – user2594 Aug 10 '20 at 15:17
  • Yes; I only see 1 seek (skip the deserialization exception); the remaining records in the list (if any) need to have seeks performed (to the lowest offset for each partition) for the remaining records. This is exactly what the `STCEH` does so it's not clear why you need a custom error handler at all. – Gary Russell Aug 10 '20 at 16:06
  • I tried with STCEH with new SeekToCurrentErrorHandler((record, exception) -> { logger.error("..." ); }, 1); but it runs in an infinite loop. I also tried setting the concurrency of the container to 1, but still no success. And when I don't want any retry attempts, I tried with a backoff value of 0, but for any exception it still runs in an infinite loop. Any suggestions here @Gary Russell – user2594 Aug 10 '20 at 17:41
  • I see you are missing the `ErrorHandlingDeserializer` - I'll add an example to my answer. Parsing the topic/partition from the exception is likely a bit brittle. – Gary Russell Aug 10 '20 at 18:18
  • See the new example in my answer. – Gary Russell Aug 10 '20 at 18:36
  • Thank you @Gary Russell for the detailed answer. – user2594 Aug 10 '20 at 19:58