I am building a high-volume system that will process up to a hundred million messages every day. I have a microservice that reads messages from a Kafka topic and does some basic processing on them before forwarding them to the next microservice.
Kafka Topic -> Normalizer Microservice -> Ordering Microservice
Below is what the processing would look like:
- The Normalizer would concurrently pick up messages from the Kafka topic.
- It would post each message to an in-memory SEDA queue, from which the message would subsequently be picked up, normalized, and validated.
- This normalization, validation, and processing is expected to take around 1 second per message. Within this one second, the message is stored in the database and becomes persistent in the system. (A sketch of this pipeline follows the list.)
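For context, here is a minimal sketch of that route. I am assuming Apache Camel's Kafka and SEDA components here; the endpoint names, broker address, and the NormalizerService stub are all placeholders:

```java
import org.apache.camel.builder.RouteBuilder;

public class NormalizerRoutes extends RouteBuilder {

    // Placeholder for the real normalization/validation/persistence logic.
    public static class NormalizerService {
        public String normalizeAndValidate(String body) {
            // ~1 second of normalization and validation work
            return body;
        }
        public String persistToDatabase(String body) {
            // the message only becomes durable once this write completes
            return body;
        }
    }

    @Override
    public void configure() {
        // Stage 1: consume from Kafka and hand off to the in-memory queue.
        // From this point on, the message exists only in the JVM heap.
        from("kafka:incoming-messages?brokers=localhost:9092&groupId=normalizer")
            .to("seda:normalize");

        // Stage 2: pick messages off the SEDA queue, normalize, validate,
        // persist, and forward to the Ordering service's topic.
        from("seda:normalize?concurrentConsumers=8")
            .bean(NormalizerService.class, "normalizeAndValidate")
            .bean(NormalizerService.class, "persistToDatabase")
            .to("kafka:ordering-input?brokers=localhost:9092");
    }
}
```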
My concern is what happens if, during this processing, a message has already been read from the topic and posted to the SEDA queue and has either
- not yet been picked up from the SEDA queue, or
- been picked up from the SEDA queue and is currently being processed but has not yet been persisted to the database,
and the Normalizer JVM crashes or is force-killed (kill -9). How do I ensure that I do NOT lose the message?
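To make the window concrete, here is where it sits when illustrated with the plain kafka-clients consumer and auto-commit enabled (topic name, group id, and config values are placeholders; the same applies to whatever consumer wrapper is actually in use):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LossWindowDemo {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "normalizer");
        // Offsets are committed on a timer, independently of processing.
        props.put("enable.auto.commit", "true");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        BlockingQueue<String> sedaQueue = new ArrayBlockingQueue<>(10_000);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("incoming-messages"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // The message now exists only in JVM memory. A later
                    // auto-commit can advance the offset past it before it
                    // is persisted -- if the JVM dies (kill -9) in that
                    // window, the message is gone from my side and will not
                    // be redelivered on restart.
                    sedaQueue.put(record.value());
                }
            }
        }
    }
}
```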
It is critical that I do NOT drop/lose any messages; even in the case of a crash/failure, I should retain the message so that I can trigger re-processing of it if required.
One naïve approach that comes to mind is to push the message to a cache first (which would be a very fast operation):
Read from topic -> Push to cache -> Push to SEDA queue
Needless to say, the problem still exists; this just makes it less probable that I will lose a message. It is also certainly not the smartest solution out there.
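For what it's worth, this is roughly what I mean; MessageCache is a hypothetical stand-in for something like Redis:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class CachingHandoff {

    // Hypothetical external cache; a real implementation might back this
    // with Redis or similar.
    interface MessageCache {
        void put(String messageId, String payload);
        void remove(String messageId);
    }

    private final MessageCache cache;
    private final BlockingQueue<String> sedaQueue = new ArrayBlockingQueue<>(10_000);

    public CachingHandoff(MessageCache cache) {
        this.cache = cache;
    }

    // Read from topic -> push to cache -> push to SEDA queue.
    public void onRecord(String messageId, String payload) throws InterruptedException {
        cache.put(messageId, payload); // message survives a JVM crash from here on,
        sedaQueue.put(payload);        // but a crash before the cache write still
                                       // loses it, and recovery needs a replay job
                                       // that scans the cache for orphaned entries.
    }

    // Called once the message has been persisted to the database.
    public void onPersisted(String messageId) {
        cache.remove(messageId);
    }
}
```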
Please share your thoughts on how I can design this system so that messages are preserved on my side once they have been read off the Kafka topic, even in the event of the Normalizer JVM crashing.