I have a use case in which I receive tweets on one topic and user details on another topic. I need to look up the username in the user details and set it on each tweet. With the following code I get the expected outcome.
KStream<String, Tweet> tweetStream = builder
        .stream("tweet-topic",
                Consumed.with(Serdes.String(),
                        serdeProvider.getTweetSerde()));

KTable<String, User> userTable = builder.table("user-topic",
        Consumed.with(Serdes.String(),
                serdeProvider.getUserSerde()));

KStream<String, Tweet> finalStream = tweetStream.leftJoin(userTable, (tweetDetail, userDetail) -> {
    if (userDetail != null) {
        return tweetDetail.setUserName(userDetail.getName());
    }
    return tweetDetail;
}, Joined.with(Serdes.String(), serdeProvider.getTweetSerde(),
        serdeProvider.getUserSerde()));
However, with 1,000 records in the KTable topic, processing 1 million records with this logic takes more than 2 hours. Earlier it was taking 2 to 3 minutes.
Earlier, when the user details were kept in a local hash map, it took approximately 10 minutes to process all the data. Is there any other way to avoid the leftJoin or improve its performance?
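For comparison, the local hash map lookup mentioned above can be sketched roughly as follows. This is a simplified, Kafka-free illustration and not the actual code: the `Tweet` class, its fields, the `enrich` helper and the sample data are all hypothetical stand-ins, and it assumes tweets carry a user id that matches the key of the user details.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashMapLookupSketch {

    // Hypothetical stand-in for the real Tweet type; only the fields needed here.
    public static final class Tweet {
        private final String userId;
        private final String text;
        private String userName;

        public Tweet(String userId, String text) {
            this.userId = userId;
            this.text = text;
        }

        public String getUserId() { return userId; }
        public String getText() { return text; }
        public String getUserName() { return userName; }

        // Fluent setter, mirroring how setUserName is used in the join above.
        public Tweet setUserName(String userName) {
            this.userName = userName;
            return this;
        }
    }

    // Same logic as the leftJoin's value joiner: set the name when the user
    // is known, otherwise return the tweet unchanged.
    public static Tweet enrich(Tweet tweet, Map<String, String> userNamesById) {
        String name = userNamesById.get(tweet.getUserId());
        if (name != null) {
            return tweet.setUserName(name);
        }
        return tweet;
    }

    public static void main(String[] args) {
        // Assumed sample data: user id -> user name.
        Map<String, String> userNamesById = new HashMap<>();
        userNamesById.put("u1", "alice");

        List<Tweet> tweets = Arrays.asList(
                new Tweet("u1", "hello"),
                new Tweet("u2", "no matching user"));

        for (Tweet tweet : tweets) {
            Tweet enriched = enrich(tweet, userNamesById);
            System.out.println(enriched.getUserId() + " -> " + enriched.getUserName());
        }
    }
}
```

Each lookup here is an in-memory `HashMap.get`, so there is no state store access or serialization per record, which is the baseline the join is being compared against.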