
In my application, I set up Kafka as follows.

```java
@Configuration
public class KafkaConfiguration {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory(ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
                                                                                                 ConsumerFactory<Object, Object> consumerFactory,
                                                                                                 KafkaTransactionManager<?, ?> kafkaTransactionManager,
                                                                                                 KafkaTemplate<Object, Object> template) {
        ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
        configurer.configure(factory, consumerFactory);

        // Send all failed messages from topic XYZ to a dead-letter topic with the .DLT suffix (i.e. XYZ.DLT)
        factory.getContainerProperties().setTransactionManager(kafkaTransactionManager);
        factory.setAfterRollbackProcessor(new DefaultAfterRollbackProcessor<>(
                new DeadLetterPublishingRecoverer(template), new FixedBackOff(50, 1)));

        return factory;
    }

    @Bean
    public RecordMessageConverter converter() {
        return new StringJsonMessageConverter();
    }
}
```

Below is what I have in `application.yml`:

```yaml
spring:
  kafka:
    admin:
      fail-fast: true
    consumer:
      bootstrap-servers: 127.0.0.1:9092
      group-id: wfo
      enable-auto-commit: false
      auto-offset-reset: latest
      key-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer2
      value-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer2
      properties:
        spring.deserializer.key.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
        spring.deserializer.value.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
        isolation.level: read_committed
    producer:
      bootstrap-servers: 127.0.0.1:9092
      transaction-id-prefix: tx.
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
```

To test the listener on transaction failure, I created the following class:

```java
@Service
public class EventListener {

    @KafkaListener(topics = "test_topic")
    public void listen(TestEvent event) {
        System.out.println("RECEIVED EVENT: " + event.getPayload());

        if (event.getPayload().contains("fail"))
            throw new RuntimeException("TEST TRANSACTION FAILED");
    }
}
```
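
A minimal publisher that exercises this flow might look like the following (a sketch, not the application's actual publisher; the `EventPublisher` name and JSON shape are illustrative). Because `transaction-id-prefix` is set, the producer is transactional, so the send must run inside a transaction, and the value is a JSON string to match the `StringSerializer` / `StringJsonMessageConverter` setup above:

```java
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class EventPublisher {

    private final KafkaTemplate<Object, Object> kafkaTemplate;

    public EventPublisher(KafkaTemplate<Object, Object> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(String payload) {
        // The producer is transactional (transaction-id-prefix is set), so
        // send() must be called inside a transaction; executeInTransaction
        // starts a local one. The value is sent as a JSON string because the
        // producer is configured with StringSerializer.
        kafkaTemplate.executeInTransaction(t ->
                t.send("test_topic", "{\"customerCode\":\"DVTPRDFT411\",\"payload\":\"" + payload + "\"}"));
    }
}
```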

When I publish a `TestEvent`, I can see the payload printed on the console. When I include the word `fail` in the payload, a `RuntimeException` is thrown and I see a *Transaction rolled back* error message on the console.

However, after retries failed for about a minute, I saw the following exception on the console:

```
2020-06-20 17:07:46,326 ERROR [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] o.s.kafka.support.LoggingProducerListener   : Exception thrown when sending a message with key='null' and payload='{"customerCode":"DVTPRDFT411","payload":"MSGfail 90"}' to topic test_topic.DLT and partition 2:
org.apache.kafka.common.errors.TimeoutException: Topic test_topic.DLT not present in metadata after 60000 ms.
2020-06-20 17:07:46,327 ERROR [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] o.s.kafka.listener.DeadLetterPublishingRecoverer   : Dead-letter publication failed for: ProducerRecord(topic=test_topic.DLT, partition=2, headers=RecordHeaders(headers = [RecordHeader(key = __TypeId__, value = [99, 111, 109, 46, 102, 116, 46, 100, 101, 109, 111, 46, 100, 116, 111, 46, 100, 101, 98, 117, 103, 46, 75, 97, 102, 107, 97, 69, 118, 101, 110, 116, 82, 101, 113, 117, 101, 115, 116, 36, 71, 101, 110, 101, 114, 105, 99, 69, 118, 101, 110, 116]), RecordHeader(key = kafka_dlt-original-topic, value = [116, 101, 115, 116, 95, 116, 111, 112, 105, 99]), RecordHeader(key = kafka_dlt-original-partition, value = [0, 0, 0, 2]), RecordHeader(key = kafka_dlt-original-offset, value = [0, 0, 0, 0, 0, 0, 0, 0]), RecordHeader(key = kafka_dlt-original-timestamp, value = [0, 0, 1, 114, -48, -29, -90, 47]), RecordHeader(key = kafka_dlt-original-timestamp-type, value = [67, 114, 101, 97, 116, 101, 84, 105, 109, 101]), RecordHeader(key = kafka_dlt-exception-fqcn, value = [111, 114, 103, 46, 115, 112, 114, 105, 110, 103, 102, 114, 97, 109, 101, 119, 111, 114, 107, 46, 107, 97, 102, 107, 97, 46, 108, 105, 115, 116, 101, 110, 101, 114, 46, 76, 105, 115, 116, 101, 110, 101, 114, 69, 120, 101, 99, 117, 116, 105, 111, 110, 70, 97, 105, 108, 101, 100, 69, 120, 99, 101, 112, 116, 105, 111, 110]), RecordHeader(key = kafka_dlt-exception-message, value = [76, 105, 115, 116, 101, 110, 101, 114, 32, 109, 101, 116, 104, 111, 100, 32, 39, 112, 117, 98, 108, 105, 99, 32, 118, 111, 105, 100, 32, 99, 111, 109, 46, 102, 116, 46, 100, 101, 109, 111, 46, 115, 101, 114, 118, 105, 99, 101, 46, 69, 118, 101, 110, 116, 76, 105, 115, 116, 101, 110, 101, 114, 46, 108, 105, 115, 116, 101, 110, 40, 99, 111, 109, 46, 102, 116, 46, 101, 118, 101, 110, 116, 46, 109, 97, 110, 97, 103, 101, 109, 101, 110, 116, 46, 100, 101, 102, 105, 110, 105, 116, 105, 111, 110, 46, 84, 101, 115, 116, 69, 118, 101, 110, 116, 41, 39, 32, 116, 104, 114, 101, 119, 32, 101, 120, 99, 101, 112, 116, 105, 111, 110, 59, 32, 110, 101, 115, 116, 101, 100, 32, 101, 120, 99, 101, 112, 116, 105, 111, 110, 32, 105, 115, 32, 106, 97, 118, 97, 46, 108, 97, 110, 103, 46, 82, 117, 110, 116, 105, 109, 101, 69, 120, 99, 101, 112, 116, 105, 111, 110, 58, 32, 84, 69, 83, 84, 32, 84, 82, 65, 78, 83, 65, 67, 84, 73, 79, 78, 32, 70, 65, 73, 76, 69, 68, 59, 32, 110, 101, 115, 116, 101, 100, 32, 101, 120, 99, 101, 112, 116, 105, 111, 110, 32, 105, 115, 32, 106, 97, 118, 97, 46, 108, 97, 110, 103, 46, 82, 117, 110, 116, 105, 109, 101, 69, 120, 99, 101, 112, 116, 105, 111, 110, 58, 32, 84, 69, 83, 84, 32, 84, 82, 65, 78, 83, 65, 67, 84, 73, 79, 78, 32, 70, 65, 73, 76, 69, 68]), RecordHeader(key = kafka_dlt-exception-stacktrace, value = [111, 100]), RecordHeader(key = b3, value = [100, 57, 51, 100, 56, 97, 98, 57, 55, 54, 97, 100, 54, 102, 49, 100, 45, 100, 57, 51, 100, 56, 97, 98, 57, 55, 54, 97, 100, 54, 102, 49, 100, 45, 49])], isReadOnly = false), key=null, value={"customerCode":"DVTPRDFT411","payload":"MSGfail 90"}, timestamp=null)
org.springframework.kafka.KafkaException: Send failed; nested exception is org.apache.kafka.common.errors.TimeoutException: Topic test_topic.DLT not present in metadata after 60000 ms.
    at org.springframework.kafka.core.KafkaTemplate.doSend(KafkaTemplate.java:570)
    at org.springframework.kafka.core.KafkaTemplate.send(KafkaTemplate.java:385)
    at org.springframework.kafka.listener.DeadLetterPublishingRecoverer.publish(DeadLetterPublishingRecoverer.java:278)
    at org.springframework.kafka.listener.DeadLetterPublishingRecoverer.lambda$accept$3(DeadLetterPublishingRecoverer.java:209)
    at org.springframework.kafka.core.KafkaTemplate.executeInTransaction(KafkaTemplate.java:463)
    at org.springframework.kafka.listener.DeadLetterPublishingRecoverer.accept(DeadLetterPublishingRecoverer.java:208)
    at org.springframework.kafka.listener.DeadLetterPublishingRecoverer.accept(DeadLetterPublishingRecoverer.java:54)
    at org.springframework.kafka.listener.FailedRecordTracker.skip(FailedRecordTracker.java:106)
    at org.springframework.kafka.listener.SeekUtils.lambda$doSeeks$2(SeekUtils.java:84)
    at java.util.ArrayList.forEach(Unknown Source)
    at org.springframework.kafka.listener.SeekUtils.doSeeks(SeekUtils.java:81)
    at org.springframework.kafka.listener.DefaultAfterRollbackProcessor.process(DefaultAfterRollbackProcessor.java:102)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.recordAfterRollback(KafkaMessageListenerContainer.java:1700)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListenerInTx(KafkaMessageListenerContainer.java:1662)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListener(KafkaMessageListenerContainer.java:1614)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeListener(KafkaMessageListenerContainer.java:1348)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1064)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:972)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
```

For some reason, the code couldn't publish events to the DLT topic. I tried editing the broker configuration to include `listeners=PLAINTEXT://:9092`, but it didn't help.

I'd be very grateful if you could point me in the right direction to resolve this issue.

**UPDATE:**

I do see the DLT topic when I execute the `kafka-topics --list --zookeeper localhost:2181` command.


By the way, this might be a bug in the latest Spring Kafka version. I downgraded to Kafka v2.4.1 and Spring Boot v2.2.7 and it's working fine. I reported the bug here:

https://github.com/spring-projects/spring-kafka/issues/1516

If there's some new configuration to be done in Kafka v2.5.0 to make this work, please let me know.

Mr.J4mes

1 Answer


> org.apache.kafka.common.errors.TimeoutException: Topic test_topic.DLT not present in metadata after 60000 ms.

From the error, it seems that the topic `test_topic.DLT` does not exist.
For each topic you create, for example `XXX`, you need to create its corresponding `XXX.DLT` topic.
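
If you want the application to create it at startup, a minimal sketch using Spring's `KafkaAdmin` support could look like this (the class name and the partition/replica counts are assumptions; the DLT needs at least as many partitions as the original topic):

```java
import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class TopicConfiguration {

    // KafkaAdmin (auto-configured by Spring Boot) creates NewTopic beans on startup.
    // The partition count is an assumption: it must be at least as large as the
    // original topic's, because dead-letter records target the same partition number.
    @Bean
    public NewTopic testTopicDlt() {
        return TopicBuilder.name("test_topic.DLT")
                .partitions(3)
                .replicas(1)
                .build();
    }
}
```

Alternatively, from the CLI: `kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --create --topic test_topic.DLT --partitions 3 --replication-factor 1` (again, match the partition count of the original topic).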

As a side-note, `ErrorHandlingDeserializer2` is deprecated since 2.5 in favour of `ErrorHandlingDeserializer`.
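
On 2.5, the consumer section of your `application.yml` would use the replacement class along these lines (a sketch; the delegate properties are unchanged):

```yaml
spring:
  kafka:
    consumer:
      # ErrorHandlingDeserializer replaces the deprecated ErrorHandlingDeserializer2
      key-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
      properties:
        spring.deserializer.key.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
        spring.deserializer.value.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
```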

**UPDATE:**
When you list topics with `kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --list`, are you able to see `test_topic.DLT`? And when you describe it with `kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --describe --topic test_topic.DLT`, can you see the same number of partitions as the original topic; more precisely, can you see partition 2?

As per the Spring Kafka documentation:

> By default, the dead-letter record is sent to a topic named `<originalTopic>.DLT` (the original topic name suffixed with `.DLT`) and to the same partition as the original record.
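
So if the DLT has fewer partitions than the original topic, publishing to a partition that doesn't exist there (partition 2 in your log) will time out exactly as shown. If you'd rather keep a single-partition DLT, you can override the default destination resolver when constructing the recoverer. A minimal sketch (here `template` is the `KafkaTemplate` from your factory method; always sending to partition 0 is an illustrative choice):

```java
import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;

// Override the default resolver so dead-letter records always go to
// partition 0 of the .DLT topic instead of the failed record's partition.
DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template,
        (record, exception) -> new TopicPartition(record.topic() + ".DLT", 0));
```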

jumping_monkey
  • Thanks for the suggestion. I created the topic DLT but it still doesn't work. Besides, `allow.auto.create.topics = true` – Mr.J4mes Jun 20 '20 at 10:52
  • 1
    awesome question man. I ran the command and what I found is that the DLT topic only has 1 partition but the message is being pushed to the same partition number as that of the original topic. Meaning when a message comes from partition 4, it will also be pushed to the DLT topic at partition 4. This did not happen with previous versions. Maybe they changed the logic somewhere – Mr.J4mes Jun 20 '20 at 12:46
  • Great, that was the reason for my question, and why `allow.auto.create.topics = true` is not recommended, i.e. if you do not need topics created on the fly (although I am not sure that is the cause of your issue, since your use case was successful in `v2.2.7.RELEASE`). At least now we know where the issue lies. As a side-note, I just replicated your issue on Spring Boot `v2.3.0.RELEASE`, with a slightly different configuration than yours, and the record is being published correctly on my DLT when I `throw new RuntimeException`. Strange. – jumping_monkey Jun 20 '20 at 13:07
  • I included a link to the report I submitted to Spring Kafka on GitHub in my post. If you want, you can download the zip file containing the test application. I could reproduce the issue consistently when switching between the 2 versions. If you have time, maybe you can help me identify the differences from your test version, and we can both learn why it worked for you in `v2.3.0.RELEASE`. – Mr.J4mes Jun 20 '20 at 14:55
  • 1
    There is no difference between 2.5.x and earlier versions - the DLPR has always published to the same partition as the original message, by default. You can change that behavior by configuring a destination resolver. See [the documentation](https://docs.spring.io/spring-kafka/docs/2.5.2.RELEASE/reference/html/#dead-letters). `>Therefore, when you use the default resolver, the dead-letter topic must have at least as many partitions as the original topic.` – Gary Russell Jun 20 '20 at 15:12