I have two Kafka topics that contain different information about a "Warning Event". To find out which entries in Topic A and Topic B correspond to one another, I have to compare the serial, date and machine from the key and then join on the 'no' field from the value.

IMPORTANT: The row number ("row" in the key) can differ between the two topics (see the example below).

Topic A contains values like this:

key: {"serial":187,"date":"11/16/2022","row":0,"machine":"Blue"}

value: { "no": 1, "frequency": 0 }

Topic B contains values like this:

key: {"serial":187,"date":"11/16/2022","row":99,"machine":"Blue"}

value: { "warning": "Emergency", "no": 1 }

My desired output, Topic C, combines the two pieces of information via the 'no' field:

key: {"serial":187,"date":"11/16/2022","row":0,"machine":"Blue"}

value: { "no": 1, "frequency": 0, "warning": "Emergency" }

Basically, it should be Topic A (with the same key) plus the warning name (clear text) from Topic B.

I am a complete noob to Kafka, so I have been struggling with this problem for a long time.

I tried to use a KStream inner join, but it is difficult to get the join to work when one part of the key (the row number) does not have to be the same.

klimzera

1 Answer

To do a proper join, both streams need to have the same key, so you would need to modify the key before the join, e.g., via selectKey() or map(), to remove the row number. If you don't need the row number in the output, it's simplest to use selectKey() and just drop it. If you do need the row number in the output, you would need to use map() and move it from the key into the value (after the join, you can move it back from the value into the key using another map()).

KStream left = builder.stream(...).selectKey(...);
KStream right = builder.stream(...).selectKey(...);
KStream result = left.join(right, ...);
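
To make the re-keying idea more concrete, here is a rough plain-Java sketch (no Kafka Streams dependency, just in-memory lists) of what the key change achieves. All names in it (RekeyJoinSketch, FullKey, JoinKey, joinKey, and so on) are made up for illustration. It also assumes, based on the question, that 'no' from the value should be part of the join key. In a real topology, something like joinKey would be the function passed to selectKey() or map(), and windowing, serdes and repartitioning would still have to be handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "re-key, then join". Not Kafka Streams code.
class RekeyJoinSketch {
    // Key as it appears on both topics (hypothetical type names).
    record FullKey(int serial, String date, int row, String machine) {}
    // Join key: the full key without "row", plus "no" taken from the value.
    record JoinKey(int serial, String date, String machine, int no) {}
    record ValueA(int no, int frequency) {}
    record ValueB(String warning, int no) {}
    record Joined(int no, int frequency, String warning) {}
    record Entry<K, V>(K key, V value) {}

    // The re-keying step: drop "row" so that matching entries get equal keys.
    static JoinKey joinKey(FullKey key, int no) {
        return new JoinKey(key.serial(), key.date(), key.machine(), no);
    }

    // Inner join of A and B on the re-keyed value; the output keeps A's original key.
    static List<Entry<FullKey, Joined>> join(List<Entry<FullKey, ValueA>> topicA,
                                             List<Entry<FullKey, ValueB>> topicB) {
        // Index Topic B by the row-independent join key.
        Map<JoinKey, ValueB> right = new HashMap<>();
        for (Entry<FullKey, ValueB> b : topicB) {
            right.put(joinKey(b.key(), b.value().no()), b.value());
        }
        List<Entry<FullKey, Joined>> out = new ArrayList<>();
        for (Entry<FullKey, ValueA> a : topicA) {
            ValueB match = right.get(joinKey(a.key(), a.value().no()));
            if (match != null) { // inner join: skip entries without a partner
                ValueA v = a.value();
                out.add(new Entry<>(a.key(),
                        new Joined(v.no(), v.frequency(), match.warning())));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample data taken from the question: same serial/date/machine, different row.
        var a = List.of(new Entry<>(new FullKey(187, "11/16/2022", 0, "Blue"),
                                    new ValueA(1, 0)));
        var b = List.of(new Entry<>(new FullKey(187, "11/16/2022", 99, "Blue"),
                                    new ValueB("Emergency", 1)));
        System.out.println(join(a, b));
    }
}
```

Because the output keeps the original key from Topic A, the row number never has to travel through the value in this sketch. In Kafka Streams the join result is keyed by the join key, which is why the answer suggests moving the row number into the value first and restoring it afterwards.
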
Matthias J. Sax