3

I'll illustrate my question with Twitter. Twitter has a microservice-based architecture, which means that different processes run on different servers and have their own databases.

A new tweet appears: service A stores some data in its own database, generates new events, and fires them. Services B and C haven't received these events yet, so they haven't stored anything in their databases or processed anything.

The user who created the tweet now wants to edit it. For that to work, all three services A, B, and C should have processed all events and stored the required data, but services B and C aren't consistent yet. That means we cannot provide edit functionality at the moment.

As far as I can see, one possible workaround is switching to immediate consistency, but that would take away all the benefits of a microservice-based architecture and would likely cause tight-coupling problems.

Another workaround is to restrict the user's actions until the data is consistent across all necessary services. That may be a solution, depending on the customer and their business requirements.

Yet another workaround is to add extra logic, or perhaps a service D, that stores edits as user actions and applies them to the data only once it is consistent. The drawback is a significant increase in the system's complexity.
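
To make that third workaround concrete, here is a minimal Python sketch of the "service D" idea: edits are accepted immediately but buffered, and applied only once every downstream service reports that it has processed the original event. The names (`has_seen`, `storage.update`, the polling interval) are illustrative assumptions, not part of any real system:

    import queue
    import time

    pending_edits = queue.Queue()

    def is_consistent(tweet_id, services):
        # Ask every downstream service whether it has processed the
        # "tweet created" event; in a real system this could be a
        # version or offset check per service.
        return all(svc.has_seen(tweet_id) for svc in services)

    def submit_edit(tweet_id, new_text):
        # Accept the edit immediately; it is applied later, once safe.
        pending_edits.put((tweet_id, new_text))

    def apply_loop(services, storage):
        while True:
            tweet_id, new_text = pending_edits.get()
            if is_consistent(tweet_id, services):
                storage.update(tweet_id, new_text)       # safe to apply now
            else:
                pending_edits.put((tweet_id, new_text))  # not yet; retry later
                time.sleep(0.1)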

And there are two-phase commits, but they are 1) not really reliable and 2) slow.
I think slowness is a huge drawback under loads like Twitter's. The slowness could probably be worked around, whereas the lack of reliability cannot, again, without increasing the solution's complexity.
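
To show where both complaints come from, here is a rough Python sketch of a two-phase commit coordinator; the `participants` objects with `prepare()`/`commit()`/`rollback()` methods are an illustrative assumption of mine:

    def two_phase_commit(participants):
        # Phase 1 (prepare): every participant must vote yes. One slow or
        # unreachable participant stalls everyone (the speed problem).
        prepared = []
        try:
            for p in participants:
                if not p.prepare():  # may block for a long time
                    raise RuntimeError("participant voted no")
                prepared.append(p)
        except Exception:
            for p in prepared:
                p.rollback()
            return False

        # Phase 2 (commit): if the coordinator crashes right here, the
        # participants are left holding locks without knowing the outcome
        # (the reliability problem).
        for p in participants:
            p.commit()
        return True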

So, the questions are:

  1. Are there any nice solutions to the illustrated situation, or only the things I mentioned as workarounds? Maybe some programming platforms or databases?
  2. Have I misunderstood something, and are some of the workarounds incorrect?
  3. Is there any approach other than Eventual Consistency that will guarantee that all data will be stored and all necessary actions will be executed by the other services?

Why was Eventual Consistency picked for this use case? As far as I can see, right now it is the only way to guarantee that some data will be stored or some action will be performed, if we are talking about an event-driven approach where some services start their work when some event is fired; following my example, that event would be “tweet is created”. So, in case services B and C go down, I need to be able to perform the action successfully when they are up again.
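
For what it's worth, one common way to get that guarantee in an event-driven setup is the transactional-outbox pattern: the state change and the event are written in a single local transaction, and a relay publishes pending events afterwards, so B and C can catch up whenever they come back up. A minimal sketch, assuming SQLite for service A's database and illustrative table names:

    import json
    import sqlite3

    db = sqlite3.connect("service_a.db")
    db.execute("CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, body TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS outbox (id INTEGER PRIMARY KEY, payload TEXT, published INTEGER DEFAULT 0)")

    def create_tweet(body):
        # One atomic local transaction: the tweet and its event are stored
        # together, so the event cannot be lost if the process crashes.
        with db:
            cur = db.execute("INSERT INTO tweets (body) VALUES (?)", (body,))
            event = {"type": "tweet_created", "tweet_id": cur.lastrowid, "body": body}
            db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))
        return cur.lastrowid

    def relay(publish):
        # Publish pending events; `publish` is a hypothetical broker call.
        # Delivery is at-least-once, so consumers must deduplicate.
        rows = db.execute("SELECT id, payload FROM outbox WHERE published = 0").fetchall()
        for row_id, payload in rows:
            publish(json.loads(payload))
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        db.commit()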

The things I would like to achieve are reliability, the ability to bear high loads, and adequate complexity of the solution. Any links on related subjects will be very much appreciated.

If this approach has natural limitations and what I want cannot be achieved using this paradigm, that is okay too. I just need to know that this problem really hasn't been solved yet.

cassandrad
  • 3,412
  • 26
  • 50
  • Although this is a very interesting topic, this doesn't seem a valid SO question. – Constantin Galbenu Apr 04 '17 at 16:45
  • @ConstantinGALBENU it is about programming, and it has a concrete problem described. It's not even opinion-based, because particular requirements for the solution are mentioned. Why isn't it a valid question? It looks like any other question about a programming paradigm or development pattern. – cassandrad Apr 04 '17 at 16:50
  • In the particular example you have given, I would rather choose to store the tweet in a common DB and interact with it using a write-through data grid, because the requirement demands a common data store. – Jaydeep Rajput Apr 04 '17 at 18:19

2 Answers

0

It is all about tradeoffs. With eventual consistency in your example, it may mean that the user cannot edit for a few seconds, since most eventually consistent technologies do not take long to replicate data across nodes. So in this use case it is absolutely acceptable, since users are pretty slow in their actions.

For example:

MongoDB is strongly consistent by default: reads and writes are issued to the primary member of a replica set. Applications can optionally read from secondary replicas, where data is eventually consistent by default.

from the official MongoDB FAQ
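
A small pymongo sketch of that tradeoff; the connection string, database, and collection names are placeholders of mine:

    from pymongo import MongoClient, ReadPreference

    client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
    db = client.twitter

    # Default behaviour: reads go to the primary, so they see the latest write.
    tweet = db.tweets.find_one({"_id": 42})

    # Opt in to eventually consistent reads from a secondary replica.
    stale_ok = db.tweets.with_options(read_preference=ReadPreference.SECONDARY_PREFERRED)
    tweet = stale_ok.find_one({"_id": 42})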

Another alternative that is getting more popular is to use a streaming platform such as Apache Kafka, where it is up to your architecture design how fast the stream consumer will process the data (for eventual consistency). Since the streaming platform itself is very fast, it is mostly only up to the speed of your stream processor to make the data available in the right place. So we are talking about milliseconds, and not even seconds, in most cases.
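
As a rough illustration, a kafka-python consumer loop for such a design might look like the following; the topic and group names and the `handle_event` function are assumptions of mine, not a prescribed setup:

    import json
    from kafka import KafkaConsumer  # pip install kafka-python

    consumer = KafkaConsumer(
        "tweet-events",
        bootstrap_servers="localhost:9092",
        group_id="service-b",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
        enable_auto_commit=False,  # commit only after processing (at-least-once)
    )

    for message in consumer:
        handle_event(message.value)  # hypothetical idempotent handler in service B
        consumer.commit()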

Oswin Noetzelmann
  • 9,166
  • 1
  • 33
  • 46
  • 1
    “most eventually consistent technologies do not take long”. It can take any amount of time, from minutes to days: network issues, third-party service issues, maintenance; anything could slow processing. And even streaming does not help here. – cassandrad Apr 05 '17 at 07:28
0

The key thing in these sorts of architectures is to have each service be autonomous when it comes to writes: it can take the write even if none of the other application-level services are up.

So in the example of a Twitter-like service, you would model it as:

Service A manages the content of a post

So when a user makes a post, a write happens in Service A's DB, and from that instant the post can be edited, because editing is just a request to A.
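
A toy sketch of that modeling (names are mine, with in-memory stand-ins for A's database and the event stream):

    posts = {}      # stand-in for service A's own database
    event_log = []  # stand-in for the event stream other services consume

    def create_post(post_id, body):
        posts[post_id] = body
        event_log.append({"type": "post_created", "post_id": post_id, "body": body})

    def edit_post(post_id, new_body):
        # No coordination with any other service: A is the authority on content.
        posts[post_id] = new_body
        event_log.append({"type": "post_edited", "post_id": post_id, "body": new_body})

    create_post(1, "hello world")
    edit_post(1, "hello, world!")  # valid the instant after creation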

If there's some other service that consumes the "post content" change events from A and, after a "new post" event, exposes some functionality, that functionality isn't going to be exposed until that service sees the event (yay, tautologies). But that's just physics: the sun could have gone supernova five minutes ago, and we can't take any action (not that we could have) until we "see the light".

Levi Ramsey
  • 18,884
  • 1
  • 16
  • 30