The KStream-KTable join works in a very simple way: every time a new record is emitted on the stream, a lookup by key is performed on the table.
Can this lead to unexpected behaviour in transient phases? We have a topology like so:
- One KStream A, on which we perform a selectKey, turning it into a stream A1
- One KStream B, which we groupBy and then reduce, turning it into a KTable B1
At startup, we publish two records on A and two records on B, chosen so that the keys match after the selectKey on A and the groupBy + reduce on B. However, we notice that the inner join between A1 and B1 sometimes fails to produce a result for some of these records, and we lose output that we expect.
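To make the suspected race concrete, here is a toy model of the lookup semantics in plain Java. It does not use the Kafka Streams API. The class `StreamTableJoinModel` and its methods are hypothetical names invented for illustration, and the model assumes that only stream-side records trigger a join result, while table-side records merely update the stored value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (an assumption for illustration, not Kafka Streams code).
class StreamTableJoinModel {
    // Latest value per key, standing in for the KTable B1.
    private final Map<String, String> table = new HashMap<>();
    // Join results emitted so far.
    private final List<String> output = new ArrayList<>();

    // A table-side record only updates the state; it emits nothing.
    void onTableRecord(String key, String value) {
        table.put(key, value);
    }

    // A stream-side record triggers exactly one lookup.
    // With inner-join semantics, a miss silently drops the record.
    void onStreamRecord(String key, String value) {
        String tableValue = table.get(key);
        if (tableValue != null) {
            output.add(key + ":" + value + "+" + tableValue);
        }
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        // Table record is processed first: the lookup hits.
        StreamTableJoinModel tableFirst = new StreamTableJoinModel();
        tableFirst.onTableRecord("k1", "b1");
        tableFirst.onStreamRecord("k1", "a1");
        System.out.println(tableFirst.output()); // prints [k1:a1+b1]

        // Stream record is processed first: the lookup misses and nothing is emitted.
        StreamTableJoinModel streamFirst = new StreamTableJoinModel();
        streamFirst.onStreamRecord("k1", "a1");
        streamFirst.onTableRecord("k1", "b1");
        System.out.println(streamFirst.output()); // prints []
    }
}
```

In this model the same two records give different output depending only on the order in which they are processed, which matches the intermittent loss we observe at startup.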
What is the right topology to ensure no updates get lost?