Let's assume a practical case: we have a customer service that publishes `CustomerCreated`/`CustomerUpdated` events to a customer Kafka topic, and a shipping service that listens to an order topic. When an `OrderCreated` event is read by the shipping service, it needs access to the customer's address. Instead of making a REST call to the customer service, the shipping service should already have the customer information available locally, kept in a `KTable`/`GlobalKTable` with persistent storage.
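
Roughly what I have in mind for materializing the customer topic (a minimal sketch; the topic name, the store name, and the `Customer` type are placeholders):

```java
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

StreamsBuilder builder = new StreamsBuilder();

// Materialize the customer topic into a local, persistent (RocksDB-backed)
// state store, keyed by customerId.
KTable<String, Customer> customers = builder.table(
        "customer",
        Materialized.<String, Customer, KeyValueStore<Bytes, byte[]>>as("customer-store"));
```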
My questions are about how we should implement this. We want the system to be resilient and scalable, so there will be more than one instance of both the customer and shipping services, which means there will also be more than one partition for the customer and order topics.
We could then run into scenarios like this: an `OrderCreated(orderId=1, userId=7, ...)` event is read by the shipping service, but if it uses a `KTable` to keep and access the local customer information, `userId=7` may not be there, because the partition that handles that userId could have been assigned to another shipping service instance.
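
Concretely, if the shipping service did a lookup against its local state store via interactive queries, the lookup could come back empty even though the customer exists. A sketch, reusing the assumed store name from above (where `streams` is the running `KafkaStreams` instance):

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

ReadOnlyKeyValueStore<String, Customer> store = streams.store(
        StoreQueryParameters.fromNameAndType(
                "customer-store", QueryableStoreTypes.keyValueStore()));

// Returns null whenever the customer-topic partition that owns userId=7
// is assigned to a different shipping service instance.
Customer customer = store.get("7");
```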
Offhand, this problem could be solved using a `GlobalKTable`, so that all shipping service instances have access to the whole range of customers.
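
That version would look something like this (again only a sketch; the `Order` and `ShippingOrder` types and their getters are assumptions):

```java
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

// Every instance materializes the *entire* customer topic locally.
GlobalKTable<String, Customer> allCustomers = builder.globalTable("customer");

KStream<String, Order> orders = builder.stream("order");

// This join works on any instance regardless of partition assignment:
// the key selector extracts the customerId from each order and looks it
// up in the full local copy of the customer table.
KStream<String, ShippingOrder> shippingOrders = orders.join(
        allCustomers,
        (orderId, order) -> order.getCustomerId(),
        (order, customer) -> new ShippingOrder(order, customer.getAddress()));
```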
1. Is this (`GlobalKTable`) the recommended approach to implement that pattern?
2. Is it a problem to replicate the whole customer dataset in every shipping service instance when the number of customers is very large?
3. Can/should this case be implemented using a `KTable` in some way (e.g., as in the sketch below)?
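
For the `KTable` variant, the only approach I can think of is re-keying the order stream by customerId before the join, so that the join is co-partitioned and each order is routed to the instance that owns that customer's partition. A sketch, reusing the `orders` stream and the `customers` `KTable` from the snippets above (Kafka Streams would insert a repartition topic because of `selectKey`):

```java
// Re-key each order by customerId; the subsequent join then triggers an
// automatic repartition, routing every order to the shipping service
// instance that hosts that customer's KTable partition.
KStream<String, ShippingOrder> shippingOrders = orders
        .selectKey((orderId, order) -> order.getCustomerId())
        .join(customers,
              (order, customer) -> new ShippingOrder(order, customer.getAddress()));
```

But I'm not sure whether that re-keying approach is idiomatic here, or whether the extra repartition topic causes problems at scale.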