
When using spring-cloud-stream for a streaming application (functional style) with batches, is there a way to retry/DLQ a failed message while still processing (streaming) the non-failing records?

For example: the function receives a batch of 10 records and attempts to convert them to another type, returning the new records for producing. Let's say record 8 fails on mapping; is it possible to complete the producing of records 0-7 and then retry/DLQ record 8?

Throwing `BatchListenerFailedException` with the failing index does not cause the prior messages to be sent.


Spring Kafka version: 2.8.0

code:

    @Override
    public List<Message<Context>> apply(Message<List<Context>> listMessage) {

        List<Message<Context>> output = new ArrayList<>();

        IntStream.range(0, listMessage.getPayload().size()).forEach(index -> {
            try {
                Record<Context> record = Record.fromBatch(listMessage, index);
                output.add(MessageBuilder.withPayload(record.getValue()).build());
                // simulate a failure on the last record of the batch
                if (index == listMessage.getPayload().size() - 1) {
                    throw new TransientError("offset " + record.getOffset() + " failed", new RuntimeException());
                }
            } catch (Exception e) {
                // tell the error handler which record in the batch failed
                throw new BatchListenerFailedException("Trigger retry", e, index);
            }
        });

        return output;
    }

customizer:

    private CommonErrorHandler getCommonErrorHandler(String group) {
        DefaultErrorHandler errorHandler = new DefaultErrorHandler(getRecoverer(group), getBackOff());
        errorHandler.setLogLevel(KafkaException.Level.DEBUG);
        errorHandler.setAckAfterHandle(true);
        errorHandler.setClassifications(Map.of(
                        PermanentError.class, false,
                        TransientError.class, true,
                        SerializationException.class, properties.isRetryDesErrors()),
                false);
        errorHandler.setRetryListeners(getRetryListener());
        return errorHandler;
    }

    private ConsumerRecordRecoverer getRecoverer(String group) {
        KafkaOperations<?, ?> operations = new KafkaTemplate<>(new DefaultKafkaProducerFactory<>(producerProperties()));
        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(
                operations, getDestinationResolver(group));
        recoverer.setHeadersFunction(this::buildAdditionalHeaders);
        return recoverer;
    }
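
The question does not show how these helpers are attached to the binder's listener containers; a minimal sketch of that wiring, assuming a `ListenerContainerCustomizer` bean (only `getCommonErrorHandler` above is taken from the question; the rest is an assumption):

    import org.springframework.cloud.stream.config.ListenerContainerCustomizer;
    import org.springframework.context.annotation.Bean;
    import org.springframework.kafka.listener.AbstractMessageListenerContainer;

    // sketch: register the CommonErrorHandler on every listener container the
    // Kafka binder creates, keyed by the binding's consumer group
    @Bean
    public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> containerCustomizer() {
        return (container, destination, group) ->
                container.setCommonErrorHandler(getCommonErrorHandler(group));
    }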

yaml:

spring:
  cloud:
    function:
      definition: batchFunc
    stream:
      default-binder: kafka-string-avro
      binders:
        kafka-string-avro:
          type: kafka
          environment.spring.cloud.stream.kafka.binder.consumerProperties:
            key.deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
            value.deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
            spring.deserializer.key.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
            spring.deserializer.value.delegate.class: io.confluent.kafka.serializers.KafkaAvroDeserializer
            schema.registry.url: ${SCHEMA_REGISTRY_URL:http://localhost:8081}
            value.subject.name.strategy: io.confluent.kafka.serializers.subject.TopicNameStrategy
            specific.avro.reader: true
          environment.spring.cloud.stream.kafka.binder.producerProperties:
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
            value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
            schema.registry.url: ${SCHEMA_REGISTRY_URL:http://localhost:8081}
            value.subject.name.strategy: io.confluent.kafka.serializers.subject.TopicNameStrategy
      bindings:
        batchFunc-in-0:
          binder: kafka-string-avro
          destination: records.publisher.output
          group: function2-in-group
          contentType: application/*+avro
          consumer:
            batchMode: true
        function2-out-0:
          binder: kafka-string-avro
          destination: reporter.output
          producer:
            useNativeEncoding: true
      kafka:
        binder:
          brokers: ${KAFKA_BROKER_ADDRESS:localhost:9092}
          autoCreateTopics: ${KAFKA_AUTOCREATE_TOPICS:false}
        default:
          consumer:
            startOffset: ${START_OFFSET:latest}
            enableDlq: false
      default:
        consumer:
          maxAttempts: 1
          defaultRetryable: false
  • `BatchListenerFailedException` should work as expected; edit the question to show which version you are using as well as your code and configuration. – Gary Russell Jun 14 '22 at 13:04
  • Thanks @GaryRussell, I added the customizer and yaml definitions. The BatchListenerFailedException works great; my question is how to publish the rest of the batch, since I cannot return it from the function when I throw. – Yosi Bronsberg Jun 14 '22 at 13:44
  • I don't understand the question; the BLEF contains information that record 7 failed and it is retried or DLQ'd; the remaining records (8-10) have seeks performed so they will be redelivered in the next batch. – Gary Russell Jun 14 '22 at 14:11
  • Right, but the records prior to record 7 are not published. The error handler commits the previous offset and seeks forward. As shown in the logs, the "failed message" is the whole batch. – Yosi Bronsberg Jun 14 '22 at 15:55
  • It is still not clear what you are trying to achieve; if 1-6 were successfully processed, why do you want to send them to the DLT? – Gary Russell Jun 14 '22 at 16:02
  • The application is a streamer: consume from one topic and publish to another topic. So 1-6 are supposed to be published to the function's out binding, but they are not. I fixed the code example - in this case, the consumer offset is always progressing, yet no successful message is published. – Yosi Bronsberg Jun 14 '22 at 16:05
  • Oh - no; you can't do that; instead of `Function<List<?>, ?>` use a `Consumer<List<?>>` and publish the output records using the `StreamBridge`. – Gary Russell Jun 14 '22 at 16:07
  • That was exactly the purpose of my question, so I'm glad to hear that it cannot be done with functions. I believe it's worth a note in the documentation. Thanks a lot, Gary! – Yosi Bronsberg Jun 14 '22 at 16:09
  • In any case, you are accumulating the output records in a list and throwing an exception, discarding the list. – Gary Russell Jun 14 '22 at 16:10
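
For reference, a minimal sketch of the Consumer/StreamBridge approach suggested in the comments above. `Context`, `Record`, the exception handling, and the `function2-out-0` binding name are taken from the question; everything else is an assumption, not a verified implementation:

    import java.util.List;
    import java.util.function.Consumer;

    import org.springframework.cloud.stream.function.StreamBridge;
    import org.springframework.context.annotation.Bean;
    import org.springframework.kafka.listener.BatchListenerFailedException;
    import org.springframework.messaging.Message;
    import org.springframework.messaging.support.MessageBuilder;

    // sketch: publish each mapped record as soon as it succeeds, so records
    // before a later failure are already on the output topic when
    // BatchListenerFailedException is thrown
    @Bean
    public Consumer<Message<List<Context>>> batchFunc(StreamBridge streamBridge) {
        return listMessage -> {
            for (int index = 0; index < listMessage.getPayload().size(); index++) {
                try {
                    Record<Context> record = Record.fromBatch(listMessage, index);
                    streamBridge.send("function2-out-0",
                            MessageBuilder.withPayload(record.getValue()).build());
                } catch (Exception e) {
                    // the error handler retries/DLQs this record; the rest of
                    // the batch is re-seeked and redelivered
                    throw new BatchListenerFailedException("Trigger retry", e, index);
                }
            }
        };
    }

Since a Consumer has no output binding of its own, the `function2-out-0` binding from the yaml above stays in place as the StreamBridge target.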
