As explained in the comprehensive article Crossing the Streams. The Outer KStream-KStream Join emits element as soon as it arrives, even before waiting for its match in another K-Stream. Downside of this is that it duplicates not-joined event along with every joined event.
Can you suggest any alternate way to implement a join of events without duplicating(as in outer join) or missing(as in inner join)?
As per the same click-view events example:
KStream<String, JsonNode> joinedEventsStream =
clickEventsStream.outerJoin(viewEventsStream,
(clickEvent, viewEvent) -> processJoin(clickEvent, viewEvent),/* Fire quickly if match found,*/
/* else fire after 2 seconds */
JoinWindows.of(Duration.ofSeconds(2L)), StreamJoined.with(Serdes.String(), jsonSerde, jsonSerde)
);
Expected results are explained below:
- a click event arrives 1 sec after the view - Joined events (A,A)
- a click event arrives 11 sec after the view - Different events for each. Each one after 2 seconds(Window size) of its arrival.(B,null) (null,B)
- a view Event arrives 1 sec after the click - Joined events (C,C)
- there is a view event but no click - Not-joined event after 2 seconds of its arrival (D,null)
- there is a click event but no view - Not-joined event after 2 seconds of its arrival (null,E)