
I am implementing a Spring Boot application in Java, using Spring Cloud Stream with the Kafka Streams binder.

I need to perform a blocking operation inside a KStream mapValues call, like so:

public Consumer<KStream<?, ?>> sink() {
    return input -> input
        .mapValues(value -> methodReturningCompletableFuture(value).get())
        .foreach((key, value) -> otherMethod(key, value));
}

CompletableFuture.get() throws checked exceptions (InterruptedException, ExecutionException).

How do I handle these exceptions so that the chained method doesn't get executed and the Kafka message is not acknowledged? I cannot afford message loss, and sending the message to a dead-letter topic is not an option.

Is there a better way of blocking inside map()?

alkazap
1 Answer


You can try the branching feature in Kafka Streams to control whether the chained methods execute. Here is pseudo-code you can use as a starting point and adapt to your particular use case.

final Map<String, ? extends KStream<?, String>> branches =
    input.split(Named.as("split-"))
         .branch((k, v) -> {
            try {
              // Block on the future; route the record to "good-records" only on success
              methodReturningCompletableFuture(v).get();
              return true;
            }
            catch (Exception e) {
              return false;
            }
          }, Branched.as("good-records"))
          .defaultBranch();

// Branch names in the returned map are prefixed with the name passed to split()
final KStream<?, String> kStream = branches.get("split-good-records");

kStream.foreach((key, value) -> otherMethod(key, value));

The idea here is that only the records that didn't throw an exception are sent to the named branch good-records; everything else goes into a default branch, which this pseudo-code simply ignores. You then invoke the additional chained methods (such as this foreach call) only on those "good" records.
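
One wrinkle in the sketch above is that the predicate discards the future's result, so foreach still sees the original value. A minimal variation, assuming methodReturningCompletableFuture returns a CompletableFuture<String>, maps each value to an Optional first and then branches on success, so the computed value is carried through to otherMethod:

import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Named;

public Consumer<KStream<String, String>> sink() {
    return input -> {
        // Present on success, empty when the blocking call fails
        final Map<String, KStream<String, Optional<String>>> branches = input
            .mapValues(value -> {
                try {
                    return Optional.of(methodReturningCompletableFuture(value).get());
                } catch (Exception e) {
                    return Optional.<String>empty();
                }
            })
            .split(Named.as("split-"))
            .branch((k, v) -> v.isPresent(), Branched.as("good-records"))
            .defaultBranch();

        // Unwrap and run the chained method only for successful records
        branches.get("split-good-records")
                .mapValues(Optional::get)
                .foreach((key, value) -> otherMethod(key, value));
    };
}

Records in the default branch keep their original value, so you could log or re-route them there instead of silently dropping them.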

This does not solve the problem of not acknowledging the message after an exception is thrown; that part is more challenging. However, I am curious about the use case: when an exception happens and you handle it, why don't you want to ack the message? The requirements seem rigid without using a DLT. Ideally, you would introduce some retries and, once they are exhausted, send the record to a DLT, which lets the Kafka Streams consumer acknowledge the message. The application then moves on to the next offset.
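
To illustrate the retry idea, here is a minimal sketch using Spring Retry's RetryTemplate inside mapValues, assuming spring-retry is on the classpath; the attempt count and backoff values are illustrative, not recommendations:

import java.util.function.Consumer;
import org.apache.kafka.streams.kstream.KStream;
import org.springframework.retry.support.RetryTemplate;

// At most 3 attempts, 1 second apart (illustrative values)
private final RetryTemplate retryTemplate = RetryTemplate.builder()
    .maxAttempts(3)
    .fixedBackoff(1000L)
    .build();

public Consumer<KStream<String, String>> sink() {
    return input -> input
        .mapValues(value -> {
            try {
                // Retry the blocking call; rethrow once the attempts are exhausted
                return retryTemplate.execute(ctx -> methodReturningCompletableFuture(value).get());
            }
            catch (Exception e) {
                throw new RuntimeException("Retries exhausted", e);
            }
        })
        .foreach((key, value) -> otherMethod(key, value));
}

Rethrowing once the retries are exhausted fails the stream thread by default. If you register a StreamsUncaughtExceptionHandler that returns REPLACE_THREAD, the thread is replaced and processing resumes from the last committed offset, so the failed record is reprocessed rather than acknowledged.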

The call methodReturningCompletableFuture(value).get() blocks until the future completes, assuming that methodReturningCompletableFuture() returns a Future object; note that the no-argument get() has no timeout, while the get(timeout, unit) overload throws TimeoutException when the time runs out. Either way, blocking this way inside the KStream operation is already a good approach; I don't think anything else is necessary to make it wait.
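
If an unbounded wait is a concern, the timed get(timeout, unit) overload bounds it; a sketch, with an illustrative 30-second timeout:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

try {
    // Give up if the future does not complete within 30 seconds
    String result = methodReturningCompletableFuture(value).get(30, TimeUnit.SECONDS);
} catch (TimeoutException e) {
    // Did not complete in time; handle like the other failures above
} catch (Exception e) {
    // InterruptedException / ExecutionException, as in the question
}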

sobychacko
  • Thank you for the answer. I was considering retrying, though I'm not sure what a reasonable number of retries would be. My use case is audio transcription: the Kafka stream contains audio chunks, and I transcribe them using the Vosk API. While it's possible that the cause of the error is a faulty message, it could also be a run-time error. If I use RetryTemplate, will the processing of messages that come after the one being retried be stalled? Is there no way of manual acknowledgment in KStream? – alkazap Feb 12 '22 at 01:24
  • While the retries are happening, the messages that come after the one currently being retried simply wait; Kafka Streams delivers them to your stream once the stream thread doing the retrying is freed. Kafka Streams processes records on a per-record basis. It sounds like ordering matters for the audio chunks; if that's not the case, you can increase concurrency (the number of stream threads) to handle chunks concurrently. – sobychacko Feb 12 '22 at 03:40
  • There are ways to access the consumer used by Kafka Streams and do a manual acknowledgment, but that is too low-level and might involve reflection (not sure about that, but I remember that in the past some fields were private, so you ended up using reflection to access the consumer and then invoke the manual ack). – sobychacko Feb 12 '22 at 03:42
  • Exactly, ordering matters, so I wanted to make sure that retries will stall the processing of newer messages. Thank you so much for your help! – alkazap Feb 12 '22 at 03:57