We have a stream of web events.
The stream is partitioned by (domain, uid).
All events described here are from the same domain. There are thousands of domains with very uneven traffic, hence that partitioning.
Let's say we have events from one unregistered user (uid1). We also have events from the same unregistered user on a separate device, which creates a new uid (let's call it uid2).
When the user registers on uid1, they register with an email (email1). Later they log in from the second device, so we then know that both uids belong to the same user.
When this happens, we could check a state store for the user identifier (e.g. email) on login to see if it exists and hence get the correct user.
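To make the lookup concrete, here is a minimal sketch of what I have in mind. A plain in-memory map stands in for the state store, and the class and method names (IdentityLookupSketch, resolve) are made up for illustration. This is not Kafka Streams code.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityLookupSketch {
    // Stand-in for the state store: email -> first uid seen for that email.
    private final Map<String, String> canonicalUidByEmail = new HashMap<>();

    // Called on registration or login. Returns the uid already stored for this
    // email, or stores and returns the given uid if the email is new.
    String resolve(String email, String uid) {
        return canonicalUidByEmail.computeIfAbsent(email, e -> uid);
    }

    public static void main(String[] args) {
        IdentityLookupSketch store = new IdentityLookupSketch();
        // Registration on the first device.
        System.out.println(store.resolve("email1", "uid1")); // prints "uid1"
        // Login from the second device resolves to the same user.
        System.out.println(store.resolve("email1", "uid2")); // prints "uid1"
    }
}
```

The sketch only works because both calls hit the same map, which is exactly what is not guaranteed once the events are spread over partitions.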
However, since the uids are different, their events will not be co-partitioned. Partitioning just by domain instead of (domain, uid) is not desirable, given how uneven the traffic is across domains.
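The partitioning issue can be sketched as follows. This is only an illustration: String.hashCode() stands in for the real key hash (Kafka's default partitioner hashes the serialized key bytes with murmur2), and the partition count, the domain and the key layout are assumptions.

```java
class PartitionSketch {
    // Stand-in partitioner: hash the composite key, then take it modulo the
    // partition count. floorMod keeps the result non-negative.
    static int partitionFor(String domain, String id, int numPartitions) {
        String key = domain + "|" + id;
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 12;        // hypothetical partition count
        String domain = "example.com"; // hypothetical domain

        // Keyed by (domain, uid): nothing ties the two uids to the same partition.
        System.out.println("uid1 -> " + partitionFor(domain, "uid1", numPartitions));
        System.out.println("uid2 -> " + partitionFor(domain, "uid2", numPartitions));

        // Keyed by (domain, email): both login events carry the same key,
        // so they always map to the same partition.
        System.out.println("login via uid1 -> " + partitionFor(domain, "email1", numPartitions));
        System.out.println("login via uid2 -> " + partitionFor(domain, "email1", numPartitions));
    }
}
```

So events that carry the email could be brought together by keying on it, but the events from before the login only carry the uid.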
Separately, such a user store may be too big (millions of records) to keep in every application instance, so it may be too much for a GlobalKTable.
How can this be solved?