I have done a lot of research on this, but I still can't find a suitable solution. Everywhere I look, the suggested approach is to call saveToEs() and then commit the offsets afterwards. My question is: what if saveToEs() fails for some reason?
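For reference, the commit-after-write pattern I keep seeing looks roughly like this (a sketch, not working code: `stream` is assumed to be a direct stream from `KafkaUtils.createDirectStream`, and the index name is a placeholder):

```scala
import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}
import org.elasticsearch.spark.rdd.EsSpark

stream.foreachRDD { rdd =>
  // Capture the offset ranges before doing anything else with the RDD
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // Write the batch to Elasticsearch; if this throws, the commit below is
  // never reached, so the batch is replayed from the last committed
  // offsets on restart (at-least-once semantics)
  EsSpark.saveToEs(rdd.map(_.value), "my-index")  // placeholder index name

  // Commit to Kafka only after the write succeeded (commitAsync stores
  // the offsets in Kafka itself, not in a checkpoint)
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
```

As I understand it, this only gives at-least-once delivery (a failure between the write and the commit means duplicates on replay), which is exactly why I'm asking what the correct approach is.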
What is the correct way to store offsets in Kafka when we're running a Spark Streaming job and storing our documents in ES? I tried using a BulkProcessorListener and storing offsets manually (keeping track of sorted offsets, requests, and so on), but it got out of hand, and the approach seemed too complicated for such a general task.
Can someone guide me?
For anyone interested in my approach, here is the question that explains it: Commit Offsets to Kafka on Spark Executors