
I have a KStream of user click events keyed by userID, and a KTable of user details also keyed by userID. Both the KStream and the KTable have the same number of partitions, use the same partitioning strategy, and use the same keys.

When I left-join the two, the majority of click events are not matched with user details; only some are. But when I replace the KTable with a GlobalKTable, the missing matches disappear and all click events are enriched with user details.

What can cause this issue? Does using a KeyValueMapper when joining a KStream with a GlobalKTable resolve the problem seen in the KStream-to-KTable join? If so, what would the solution be?

Edit: UserId is a compacted topic produced by the Confluent .NET client; I have changed the client's default partitioning strategy to murmur2 (the Java client's default).

YamYamm
  • Duplicate of https://stackoverflow.com/questions/60588408/kstream-to-ktable-inner-join-producing-different-number-of-records-every-time-pr – Matthias J. Sax Mar 10 '20 at 05:12

1 Answer

  1. Does using a KeyValueMapper when joining KStream and GlobalKTable resolve the issue in KStream to Ktable join?

    IMO, if we use a GlobalKTable we lose Kafka's ability to scale the user table, since every application instance then holds a full copy of it.

  2. What can cause this issue? Can you debug a few userIDs for which the user data is not enriched, and check which partition each of those userIDs lands on in both the click stream and the user table?

Tuyen Luong
  • 1. Yes, using a GlobalKTable resolves the issue, and yes, I don't want to use a GlobalKTable because I want to be able to scale out. 2. In the KTable join case, the users that are not enriched are anonymous users, so that's expected. How can I get partition numbers in Kafka Streams? – YamYamm Mar 09 '20 at 10:30
  • @YamYamm It depends on your partitioning strategy. If you are using the default strategy, you can use `org.apache.kafka.common.utils.Utils.toPositive(Utils.murmur2(key.getBytes())) % numPartitions` – Tuyen Luong Mar 09 '20 at 10:57
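
To make the formula in the last comment concrete, here is a minimal, dependency-free sketch of the check it describes. It assumes the keys are strings serialized as UTF-8 and that the producer uses the Java client's default murmur2 partitioning. The class name `PartitionCheck` and the method `partitionFor` are hypothetical, and the hash is reimplemented here only so the example runs without the Kafka jars. It is modelled on `Utils.murmur2`, and in a real project you would call the library method directly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: computes the partition a string key should map to
// under murmur2-based default partitioning (assumes UTF-8 key bytes).
class PartitionCheck {

    // 32-bit murmur2, modelled on org.apache.kafka.common.utils.Utils.murmur2.
    static int murmur2(byte[] data) {
        int length = data.length;
        final int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;

        int h = seed ^ length;
        int length4 = length / 4;

        // Mix four bytes at a time into the hash.
        for (int i = 0; i < length4; i++) {
            int i4 = i * 4;
            int k = (data[i4] & 0xff)
                    + ((data[i4 + 1] & 0xff) << 8)
                    + ((data[i4 + 2] & 0xff) << 16)
                    + ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }

        // Handle the remaining 1 to 3 bytes (intentional fall-through).
        switch (length % 4) {
            case 3:
                h ^= (data[(length & ~3) + 2] & 0xff) << 16;
            case 2:
                h ^= (data[(length & ~3) + 1] & 0xff) << 8;
            case 1:
                h ^= data[length & ~3] & 0xff;
                h *= m;
        }

        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Same idea as toPositive(murmur2(keyBytes)) % numPartitions.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return (murmur2(keyBytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // "user-42" and 6 partitions are made-up example values.
        System.out.println(partitionFor("user-42", 6));
    }
}
```

If the partition computed this way for a given userID differs from the partition the record actually sits in on either topic, the two topics are not co-partitioned for that key, which would explain misses in the KStream-to-KTable join but not in the GlobalKTable join.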