I am building a high-volume system that will process up to a hundred million messages every day. I have a microservice that reads messages from a Kafka topic and does some basic processing on them before forwarding them to the next microservice.
Kafka Topic -> Normalizer Microservice -> Ordering Microservice
Below is what the processing would look like:
- The Normalizer would concurrently pick up messages from the Kafka topic.
- It would post each message to an in-memory SEDA queue, from which the message would subsequently be picked up, normalized, and validated.
- This normalization, validation, and processing is expected to take around 1 second per message. Within this one second, the message is stored in the database and becomes persistent in the system. (A sketch of this pipeline follows the list.)
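For context, here is a minimal sketch of that route. I am assuming Apache Camel's Kafka and SEDA components here; the endpoint names, broker address, and the NormalizerService stub are all placeholders:

```java
import org.apache.camel.builder.RouteBuilder;

public class NormalizerRoutes extends RouteBuilder {

    // Placeholder for the real normalization/validation/persistence logic.
    public static class NormalizerService {
        public String normalizeAndValidate(String body) {
            // ~1 second of normalization and validation work
            return body;
        }
        public String persistToDatabase(String body) {
            // the message only becomes durable once this write completes
            return body;
        }
    }

    @Override
    public void configure() {
        // Stage 1: consume from Kafka and hand off to the in-memory queue.
        // From this point on, the message exists only in the JVM heap.
        from("kafka:incoming-messages?brokers=localhost:9092&groupId=normalizer")
            .to("seda:normalize");

        // Stage 2: pick messages off the SEDA queue, normalize, validate,
        // persist, and forward to the Ordering service's topic.
        from("seda:normalize?concurrentConsumers=8")
            .bean(NormalizerService.class, "normalizeAndValidate")
            .bean(NormalizerService.class, "persistToDatabase")
            .to("kafka:ordering-input?brokers=localhost:9092");
    }
}
```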
My concern is what happens if, during this processing, a message has already been read from the topic and posted to the SEDA queue and has either
- not yet been picked up from the SEDA queue, or
- been picked up from the SEDA queue and is currently being processed but has not yet been persisted to the database,
and the Normalizer JVM crashes or is force-killed (kill -9). How do I ensure that I do NOT lose the message?
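To make the window concrete, here is where it sits when illustrated with the plain kafka-clients consumer and auto-commit enabled (topic name, group id, and config values are placeholders; the same applies to whatever consumer wrapper is actually in use):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class LossWindowDemo {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "normalizer");
        // Offsets are committed on a timer, independently of processing.
        props.put("enable.auto.commit", "true");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        BlockingQueue<String> sedaQueue = new ArrayBlockingQueue<>(10_000);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("incoming-messages"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // The message now exists only in JVM memory. A later
                    // auto-commit can advance the offset past it before it
                    // is persisted -- if the JVM dies (kill -9) in that
                    // window, the message is gone from my side and will not
                    // be redelivered on restart.
                    sedaQueue.put(record.value());
                }
            }
        }
    }
}
```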
It is critical that I do NOT drop/lose any messages; even in the case of a crash/failure, I should retain the message so that I can trigger re-processing of it if required.
One naïve approach that comes to mind is to push the message to a cache first (which would be a very fast operation):
Read from topic -> Push to cache -> Push to SEDA queue
Needless to say, the problem still exists; this just makes it less probable that I will lose a message. It is also certainly not the smartest solution out there.
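For what it's worth, this is roughly what I mean; MessageCache is a hypothetical stand-in for something like Redis:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class CachingHandoff {

    // Hypothetical external cache; a real implementation might back this
    // with Redis or similar.
    interface MessageCache {
        void put(String messageId, String payload);
        void remove(String messageId);
    }

    private final MessageCache cache;
    private final BlockingQueue<String> sedaQueue = new ArrayBlockingQueue<>(10_000);

    public CachingHandoff(MessageCache cache) {
        this.cache = cache;
    }

    // Read from topic -> push to cache -> push to SEDA queue.
    public void onRecord(String messageId, String payload) throws InterruptedException {
        cache.put(messageId, payload); // message survives a JVM crash from here on,
        sedaQueue.put(payload);        // but a crash before the cache write still
                                       // loses it, and recovery needs a replay job
                                       // that scans the cache for orphaned entries.
    }

    // Called once the message has been persisted to the database.
    public void onPersisted(String messageId) {
        cache.remove(messageId);
    }
}
```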
Please share your thoughts on how I can design this system so that messages are preserved on my side once they have been read off the Kafka topic, even in the event of the Normalizer JVM crashing.