I've been reading some articles and questions on eventual consistency and choreographing microservices, but I haven't seen a clear answer to this question. I'll phrase it in generic terms.
In a nutshell: if a client has historically made successive synchronous REST calls to your system, what do you do when, after those calls are routed to different microservices, the later calls may return unexpected results because of eventual consistency?
Problem
Suppose you have a monolithic application that provides a REST API. Let's say there are two modules A and B you want to convert to microservices. The entities that B maintains can refer to entities that A maintains (e.g. A maintains students and B maintains classes). In the monolithic situation, the modules simply refer to the same database, but in the microservices situation, they each have their own database and communicate via asynchronous messages. So their databases are eventually consistent with respect to each other.
Some existing third-party client applications of our API are used to first (synchronously) calling an endpoint belonging to module A and, after that first call returns, immediately (i.e. a few ms later) calling an endpoint in module B as part of their workflow (e.g. creating a student and then putting that student in a class). In the new situation, this leads to a problem: when the second call happens, module B may not be aware of the changes in module A yet. So the existing workflow of the client application may break. (E.g. module B may respond that the student you're trying to put in the class doesn't exist, or is in the wrong year.)
When the calls are made separately by a human user through some frontend application, this is not a big issue, as the modules are usually consistent within a second anyway. The problem arises when a client application (which is not under our control) calls A and then immediately calls B as part of an automated workflow. Eventual consistency is simply not fast enough in that case.
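To make the race concrete, here is a minimal in-process sketch of it. Everything in it is made up for illustration: `ServiceA`, `ServiceB`, `run_workflow`, the in-memory queue standing in for the message broker, and the artificial `lag_seconds` delay. It is not how our real services are built, just the smallest thing that reproduces the symptom.

```python
import queue
import threading
import time


class ServiceA:
    """Hypothetical student service: owns its own store and publishes events."""

    def __init__(self, bus):
        self.students = {}
        self.bus = bus

    def create_student(self, student_id, name):
        self.students[student_id] = name
        # Asynchronous message to other services; A does not wait for delivery.
        self.bus.put(("student_created", student_id))
        return student_id


class ServiceB:
    """Hypothetical class service: learns about students only via events."""

    def __init__(self, bus, lag_seconds):
        self.known_students = set()
        self.classes = {}
        self.bus = bus
        self.lag_seconds = lag_seconds
        threading.Thread(target=self._consume, daemon=True).start()

    def _consume(self):
        while True:
            _event, student_id = self.bus.get()
            time.sleep(self.lag_seconds)  # simulated propagation delay
            self.known_students.add(student_id)

    def enroll(self, class_id, student_id):
        if student_id not in self.known_students:
            return 404, "unknown student"
        self.classes.setdefault(class_id, set()).add(student_id)
        return 200, "enrolled"


def run_workflow(lag_seconds, pause_seconds):
    """Call A, wait pause_seconds, then call B, as a client would."""
    bus = queue.Queue()
    a = ServiceA(bus)
    b = ServiceB(bus, lag_seconds)
    student_id = a.create_student("s1", "Alice")
    time.sleep(pause_seconds)
    return b.enroll("math-101", student_id)


if __name__ == "__main__":
    print(run_workflow(lag_seconds=0.2, pause_seconds=0.0))   # automated client
    print(run_workflow(lag_seconds=0.05, pause_seconds=0.5))  # human-paced client
```

With these (arbitrary) timings, the automated client gets the 404 from B because it calls before the event has been applied, while the human-paced client gets the 200.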
[Figure: a simple diagram that describes the situation]
Question
Is there a best practice, or a generally agreed-upon set of options, to mitigate this problem? (I made up the student/class example, so don't get hung up on the specifics of that. :))
What we can think of
- Simply telling the developers of these clients: from now on, you have to implement a retry mechanism for every endpoint you call. The drawback seems obvious: it pushes the cost of our architectural change onto clients that are not under our control.
- Implement an API gateway that waits until B is ready. Drawback: there are many conceivable workflows (involving more modules A-Z) that would require this, so the gateway might become quite complex.
- Somehow create a "session" for the client that tracks which requests it has made in succession. Then B can figure out whether it should wait for a message from A, or it could even update its state just by looking at the precise request the client made to A.
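To show what I mean by the third option, here is a rough sketch. It assumes (and these are all my own inventions, not an existing library or pattern name) that A numbers its events with an increasing version, that a shared `SessionStore` records the latest version each client has caused, and that B applies events in order so a single high-water mark is enough. B then blocks for a bounded time until it has caught up with whatever that client last wrote.

```python
import threading


class SessionStore:
    """Hypothetical shared store: latest A-side version written per client."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = {}

    def record_write(self, client_id, version):
        with self._lock:
            current = self._latest.get(client_id, 0)
            self._latest[client_id] = max(current, version)

    def required_version(self, client_id):
        with self._lock:
            return self._latest.get(client_id, 0)


class ServiceB:
    """Hypothetical class service that waits until it has seen the client's writes."""

    def __init__(self, sessions):
        self.sessions = sessions
        self.applied_version = 0  # assumes events from A are applied in order
        self.known_students = set()
        self._caught_up = threading.Condition()

    def apply_event(self, version, student_id):
        """Called by the message consumer when an event from A arrives."""
        with self._caught_up:
            self.known_students.add(student_id)
            self.applied_version = max(self.applied_version, version)
            self._caught_up.notify_all()

    def enroll(self, client_id, student_id, timeout=1.0):
        needed = self.sessions.required_version(client_id)
        with self._caught_up:
            caught_up = self._caught_up.wait_for(
                lambda: self.applied_version >= needed, timeout=timeout
            )
            if not caught_up:
                # Bounded wait expired; fail explicitly instead of answering wrongly.
                return 503, "not caught up yet"
            if student_id not in self.known_students:
                return 404, "unknown student"
            return 200, "enrolled"


if __name__ == "__main__":
    sessions = SessionStore()
    b = ServiceB(sessions)
    sessions.record_write("client-1", 1)  # A handled a write for client-1
    # The event reaches B 100 ms later; enroll() waits for it instead of failing.
    threading.Timer(0.1, b.apply_event, args=(1, "s1")).start()
    print(b.enroll("client-1", "s1"))
```

The client does not have to change anything here, which is the point, but B now depends on a shared session store, and the same logic would be needed in every service that can be called "second" in some workflow.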
Are there better methods? Which would be most suitable?
Edit: Clarified that the question primarily concerns the behaviour of third-party clients that call the endpoints in an automated way, meaning that even a few milliseconds of lag in the eventual consistency can be fatal.