I have a KTable-to-KTable join. I create the KTables using .aggregate(). The join yields a result to the next stream processor whenever either side receives a new message. I have a use case where I can receive another message on the left KTable, but the message is a "duplicate". It's not an actual duplicate in the technical sense, but it is a duplicate per my business logic (its X, Y, and Z fields have values identical to those in the previous message).

How can I check the previous aggregate value, compare it to the new value and stop that message from causing the join to yield results?

I also don't want to delete that key from the KTable, because I still need the right-side KTable to continue to join when new right-side messages come in.

I want to dynamically control when the join yields results. Is there something in the joiner I can do to check the previous state?

0x SLC
  • To solve this, I used solution #1 from the answer here: https://stackoverflow.com/a/48253594/4354090. I added a transform after the join that keeps track of the previous state and returns null if my business logic determines it's a 'repeated' message. – 0x SLC Aug 28 '20 at 16:04
  • There is work in progress that changes the current "emit on update" semantics to "emit on change" semantics, so the issue should go away in a future release: https://cwiki.apache.org/confluence/display/KAFKA/KIP-557%3A+Add+emit+on+change+support+for+Kafka+Streams. For now, I guess there is not much you can do besides what is discussed in the question you linked to. – Matthias J. Sax Sep 02 '20 at 05:54
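
The workaround described in the first comment (remember the previously forwarded result per key, and return null for a "repeated" one) can be sketched roughly as below. This is only a sketch of the comparison logic, not the code actually used: `DuplicateSuppressor`, `Joined`, and the field names are all hypothetical, and a plain `HashMap` stands in for the state store that a transformer placed after the join would use in Kafka Streams, so the example runs without any Kafka dependency. It also assumes non-null values, so tombstones are not handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiPredicate;

// Hypothetical helper: remembers the last value forwarded per key and drops
// a new value when the business-level comparison says it is a repeat.
final class DuplicateSuppressor<K, V> {
    // Assumption: stands in for a Kafka Streams key-value state store.
    private final Map<K, V> lastForwarded = new HashMap<>();
    private final BiPredicate<V, V> isRepeat;

    DuplicateSuppressor(BiPredicate<V, V> isRepeat) {
        this.isRepeat = isRepeat;
    }

    // Returns the value to forward downstream, or null to suppress it.
    // Assumes value is non-null (tombstones are not handled in this sketch).
    V process(K key, V value) {
        V previous = lastForwarded.get(key);
        if (previous != null && isRepeat.test(previous, value)) {
            return null; // repeat per business logic: emit nothing
        }
        lastForwarded.put(key, value);
        return value;
    }
}

// Hypothetical join result: X, Y, Z come from the left side, `right` from the right side.
final class Joined {
    final String x, y, z;
    final long leftReceivedAt; // differs between messages, ignored by the comparison
    final String right;

    Joined(String x, String y, String z, long leftReceivedAt, String right) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.leftReceivedAt = leftReceivedAt;
        this.right = right;
    }

    // A repeat means: same X, Y, Z on the left AND an unchanged right side,
    // so updates on the right side still pass through.
    static boolean sameBusinessFields(Joined a, Joined b) {
        return Objects.equals(a.x, b.x)
                && Objects.equals(a.y, b.y)
                && Objects.equals(a.z, b.z)
                && Objects.equals(a.right, b.right);
    }
}
```

Including the right-side value in the comparison is what keeps right-side updates flowing: a left message that repeats X, Y, and Z is dropped, while a new right-side message for the same key still produces output.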
