This is more of a pick-your-brain question based on your experience, since I haven't been able to find resources that help me decide one way or the other.
I have a Kafka message queue where around 7-8 million events are streamed every day. The messages eventually need to be persisted to a MySQL database.
Approach 1:
I can write microservices, containerize them, and run multiple instances of the container app under a single Kafka consumer group (so the topic's partitions are split across instances), with each instance's Kafka listener consuming events and writing them to MySQL.
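A minimal sketch of what one such consumer instance might look like, assuming kafka-python and mysql-connector-python as the client libraries (the topic, table, and column names below are hypothetical placeholders):

```python
def rows_from_poll(records):
    """Flatten KafkaConsumer.poll()'s {TopicPartition: [msg, ...]} dict
    into (id, payload) tuples suitable for executemany()."""
    return [(m.value["id"], m.value["payload"])
            for msgs in records.values() for m in msgs]

def run_consumer():
    # Third-party deps (assumed): kafka-python, mysql-connector-python
    import json
    from kafka import KafkaConsumer
    import mysql.connector

    consumer = KafkaConsumer(
        "events",                       # hypothetical topic name
        group_id="mysql-writers",       # every instance shares this group -> partitions are split
        bootstrap_servers=["localhost:9092"],
        enable_auto_commit=False,       # commit offsets only after the DB write succeeds
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )
    db = mysql.connector.connect(
        host="localhost", database="events_db", user="app", password="..."
    )
    cur = db.cursor()

    while True:
        # Batch inserts: 7-8M events/day averages out to roughly 90 events/sec,
        # so modest batches keep MySQL round-trips cheap.
        records = consumer.poll(timeout_ms=1000, max_records=500)
        rows = rows_from_poll(records)
        if not rows:
            continue
        cur.executemany("INSERT INTO events (id, payload) VALUES (%s, %s)", rows)
        db.commit()
        consumer.commit()               # at-least-once: offsets commit after the DB commit

if __name__ == "__main__":
    run_consumer()
```

Committing offsets after the database commit gives at-least-once delivery, so the insert should be idempotent (e.g. an upsert keyed on the event id) to tolerate replays on restart.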
Approach 2:
Another approach I was considering is a Spark job that processes the stream of events and persists them to the MySQL DB. That way I wouldn't have to manage the container app, and I could keep operational costs down.
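For comparison, the Spark route could be a Structured Streaming job along these lines (a sketch, assuming PySpark with the Kafka connector and a MySQL JDBC driver on the classpath; the topic, database, and schema names are hypothetical):

```python
def jdbc_options(host, database, table, user):
    """Build the option dict for writing a micro-batch to MySQL over JDBC."""
    return {
        "url": f"jdbc:mysql://{host}:3306/{database}",
        "dbtable": table,
        "user": user,
        "driver": "com.mysql.cj.jdbc.Driver",
    }

def run_job():
    # Third-party deps (assumed): pyspark, spark-sql-kafka connector, MySQL JDBC driver
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StructField, LongType, StringType

    spark = SparkSession.builder.appName("kafka-to-mysql").getOrCreate()
    schema = StructType([
        StructField("id", LongType()),
        StructField("payload", StringType()),
    ])

    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")
              .option("subscribe", "events")          # hypothetical topic
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    def write_batch(df, batch_id):
        # foreachBatch reuses Spark's batch JDBC writer for each micro-batch
        (df.write.format("jdbc")
           .options(**jdbc_options("localhost", "events_db", "events", "app"))
           .option("password", "...")
           .mode("append")
           .save())

    (events.writeStream
           .foreachBatch(write_batch)
           .option("checkpointLocation", "/tmp/checkpoints/kafka-to-mysql")
           .start()
           .awaitTermination())

if __name__ == "__main__":
    run_job()
```

The checkpoint location is what lets the job resume from the last committed Kafka offsets after a restart; like the consumer-group approach, this is at-least-once into MySQL unless the write is made idempotent.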
Given the volume of data, I'm not sure whether Spark would be overkill, and whether its cost would exceed the capital and operational expenses of running a container app on, say, a managed Kubernetes environment.
Can someone guide me on how to go about this?