
We are currently developing a Spring Cloud Stream application using Kafka Streams. (The problem does not seem to be specific to Spring Cloud Stream, but to Kafka Streams in general.)

Our processor needs to transform events from an incoming KStream and produce two streams. One stream carries exactly one message (a command) per incoming event, and the other carries multiple events derived from that incoming event. The transformation that produces the two outputs is expensive, so creating two processors seems inefficient.

Our current approach is roughly:

  • Define a functional bean Function<KStream<String, String>, KStream<String, String>[]>
  • Flat-map the input events to a KStream<String, Object>
  • Branch based on predicates using instanceof
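import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Predicate;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.kstream.TransformerSupplier;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.springframework.context.annotation.Bean;

@Bean
public Function<KStream<String, InputEvent>, KStream<String, Object>[]> processor() {

 // The supplier must hand out a new Transformer instance on every call.
 TransformerSupplier<String, InputEvent, Iterable<KeyValue<String, Object>>> transformerSupplier = () -> new Transformer<String, InputEvent, Iterable<KeyValue<String, Object>>>() {
   @Override
   public void init(ProcessorContext ctx) {}

   @Override
   public Iterable<KeyValue<String, Object>> transform(String key, InputEvent value)
   {
     List<OutputEvent> events = ... // BUSINESS_LOGIC
     Command command = ... // BUSINESS_LOGIC

     // Wrap each output in a KeyValue (assuming outputs keep the input key); the value type widens to Object here.
     List<KeyValue<String, Object>> result = new ArrayList<>();
     for (OutputEvent event : events) {
       result.add(KeyValue.pair(key, event));
     }
     result.add(KeyValue.pair(key, command));
     return result;
   }

   @Override
   public void close() {}
 };

 // Index 0 of the returned array carries the events, index 1 the command.
 Predicate<String, Object> eventStream = (k, v) -> v instanceof OutputEvent;
 Predicate<String, Object> commandStream = (k, v) -> v instanceof Command;

 // flatTransform (rather than transform) emits each element of the returned Iterable as its own record.
 // In the real application the result streams are serialized within the function, so we have KStream<String, String>[] as a result.
 return input -> input.flatTransform(transformerSupplier)
                      .branch(eventStream, commandStream);
}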

Is there a way to branch the KStream differently, so that we gain type safety and get rid of the `instanceof` checks and the casts that follow them?

One approach could be to produce only the event stream and use a KafkaTemplate within the transformer to publish the command, but this seems to break the overall architecture of Spring Cloud Stream. We also don't know how that approach behaves with respect to transactions.
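
Another direction we have been wondering about is a typed wrapper instead of Object. The following is only a minimal sketch in plain Java, without the Kafka Streams dependency, to illustrate the idea. The `Output` envelope, its accessors, and the stand-in `transform` method are hypothetical names invented for this example, and the `OutputEvent` and `Command` classes here are simplified placeholders for our real types. On an actual KStream<String, Output>, we assume `filter` followed by `mapValues` would play the role of the two projection methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

public class TypedBranchSketch {

    // Simplified placeholders for the domain types in the question.
    static final class OutputEvent {
        final String payload;
        OutputEvent(String payload) { this.payload = payload; }
    }

    static final class Command {
        final String name;
        Command(String name) { this.name = name; }
    }

    // Hypothetical envelope: holds exactly one of the two output types.
    static final class Output {
        private final OutputEvent event;
        private final Command command;

        private Output(OutputEvent event, Command command) {
            this.event = event;
            this.command = command;
        }

        static Output of(OutputEvent event) {
            return new Output(Objects.requireNonNull(event), null);
        }

        static Output of(Command command) {
            return new Output(null, Objects.requireNonNull(command));
        }

        Optional<OutputEvent> event() { return Optional.ofNullable(event); }
        Optional<Command> command() { return Optional.ofNullable(command); }
    }

    // Stand-in for the expensive transformation: it runs once per input
    // and yields several events plus exactly one command.
    static List<Output> transform(String input) {
        List<Output> outputs = new ArrayList<>();
        for (String part : input.split(",")) {
            outputs.add(Output.of(new OutputEvent(part)));
        }
        outputs.add(Output.of(new Command("processed:" + input)));
        return outputs;
    }

    // Typed projection of the events; no instanceof and no cast needed.
    static List<OutputEvent> events(List<Output> outputs) {
        return outputs.stream()
                .map(Output::event)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    // Typed projection of the commands.
    static List<Command> commands(List<Output> outputs) {
        return outputs.stream()
                .map(Output::command)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Output> outputs = transform("a,b,c");
        // prints "3 events, 1 command"
        System.out.println(events(outputs).size() + " events, " + commands(outputs).size() + " command");
    }
}
```

The type check does not disappear entirely; it moves into the envelope, where the compiler can see both alternatives. We have not verified how such a wrapper would interact with Serde resolution in the binder.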

  • I can't speak to the core question, but I can state that a KafkaTemplate used that way will NOT participate in any transaction managed by KafkaStreams. – Gary Russell Mar 04 '21 at 16:10
  • What is wrong with doing the `instanceof` check there? I think that is a valid predicate. – sobychacko Mar 04 '21 at 17:48
  • The `instanceof` is valid, but it shows that we lose all the type safety we might otherwise have during stream processing. Is it possible to use a single transformer to generate completely different types and still have type safety? –  Mar 04 '21 at 19:09
  • @sobychacko I have a problem with the `KStream[]` output: Spring Cloud Stream does not "inject" the proper Serde based on the signature, because the value type in the signature is Object, which is handled by ByteArraySerializer by default (I use KafkaProtobufSerde). – michaldo Dec 26 '22 at 01:40

0 Answers