I am working on a Mule API flow testing out the Salesforce event streams. I have my connector set up and subscribed to a streaming channel.
This is working just fine when I create / update / delete contact records, the events come through and I process them by adding them to another database.
I am slightly confused with the replayId
functionality. With the current setup, I can shut down the Mule app, create contacts in the org, and then when I bring the app back online, it resumes by adding data from where it left off. Perfect.
However, I am trying to simulate what would happen if the mule app crashed while processing the events.
I ran some APEX to create 100 random contact records. As soon as I see it log the first flow in my app, I kill the mule app. My assumption here was that it would know where it left off when I resume the app, as if it was offline prior to the contact creation like in the previous test.
What I have noticed is that it only processes the few contacts that made it through before I shut the app down.
It appears that the events may be coming in so quickly in the flow input, that it has already reached the last replayId
in the stream. However, since these records still haven't been added to my external database, I am losing those records. The stream did what it was supposed to do, but due to the batch of work the app is still processing, my 100 records are not being committed like the replayId
reflects.
How can I approach this so that I don't end up losing data in the event there is a large stream of data prior to an app crash? I remember with Kafka, you had to were able to commit
the id once it was inserted into the database so that it knew that the last one you officially processed. Is there such a concept in Mule where I can tell it where I have officially left off and committed to the DB?