Imagine you have a method like the following:

public void PlaceOrder(Order order)
{
    // Save first, then notify other subsystems.
    this.SaveOrderToDataBase(order);
    this.bus.Publish(new OrderPlaced(order));
}

After the order is saved to the database, an event is published to the message queuing system, so other subsystems on the same or another machine can process it.

But what happens if the this.bus.Publish(new OrderPlaced(order)) call fails? Or if the machine crashes just after saving the order to the database? The event is not published and other subsystems cannot process it. This is unacceptable. If this happens, I need to ensure that the event is eventually published.

What acceptable strategies can I use? Which is the best one?

NOTE: I don't want to use distributed transactions.

EDIT:

Paul Sasik is very close, and I think I can achieve 100%. This is what I thought:

First, create a table Events in the database like the following:

CREATE TABLE Events (EventId int PRIMARY KEY)

You may want to use GUIDs instead of int, or you may use sequences or identity columns.

Then do the following pseudocode:

open transaction
save order and event via A SINGLE transaction
in case of failure, report error and return
place order in message queue
in case of failure, report error, roll back transaction and return
commit transaction

All events must include the EventId. When event subscribers receive an event, they first check that its EventId exists in the database, and ignore the event if it does not.

This way you get 100% reliability, not only 99.999%.
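
The subscriber-side check described above could look roughly like this. It is a minimal sketch, written in Python with the standard-library sqlite3 module rather than C# purely so that it is self-contained. The function names `event_is_committed` and `handle_event` and the retry parameters are illustrative assumptions, not part of any real bus API. It retries after a short delay to cover the case where the event arrives before the publisher's transaction has been committed.

```python
import sqlite3
import time


def event_is_committed(conn, event_id, attempts=3, delay_seconds=0.05):
    """Return True once event_id is visible in the Events table.

    Retries a few times, because the event may reach the subscriber
    before the publisher's transaction has been committed.
    """
    for attempt in range(attempts):
        row = conn.execute(
            "SELECT 1 FROM Events WHERE EventId = ?", (event_id,)
        ).fetchone()
        if row is not None:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


def handle_event(conn, event_id, process):
    """Process the event only if its EventId is visible in the database."""
    if not event_is_committed(conn, event_id):
        # Not visible: the publisher rolled back (or has not committed yet).
        return False
    process(event_id)
    return True
```

How many retries and how long a delay are reasonable depends on how long the publisher's transaction can stay open, so those defaults are placeholders.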

Paul Sasik
Jesús López
  • Ah yes. Your idea is called correlation and the GUID or int ID is called a correlation ID. You might even call it a pattern. It does increase the complexity of your code but not nearly as much as handling distributed transactions. (Also see my edit above. I just specified that there should just be a single transaction managing the table inserts.) – Paul Sasik Jun 11 '15 at 14:27
  • @PaulSasik. Yes, correlation. I realized there might be a race condition with this approach. In some cases, subscribers might receive the event before the transaction is committed, so they cannot see the EventId. To mitigate this, subscribers should retry looking up the EventId after a short delay when they find it missing. Depending on the isolation level, subscribers may or may not be blocked when trying to read an EventId that is in the table but not yet committed. – Jesús López Jun 12 '15 at 06:43

2 Answers

The correct way to ensure the event is eventually published to the message queuing system is explained in this video and on this blog post.

Basically, you need to store the message to be sent in the database, in the same transaction in which you perform the business logic operation. Then you send the message to the bus asynchronously and delete the message from the database in another transaction:

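public void PlaceOrder(Order order)
{
    OrderPlaced ev;
    BeginTransaction();
    try
    {
        // Save the order and the pending event atomically, in one transaction.
        SaveOrderToDataBase(order);
        ev = new OrderPlaced(order);
        SaveEventToDataBase(ev);
        CommitTransaction();
    }
    catch
    {
        RollbackTransaction();
        return;
    }

    // Not awaited: if this fails, the background retry process sends the event later.
    PublishEventAsync(ev);
}

async Task PublishEventAsync(BusinessEvent ev)
{
    BeginTransaction();
    try
    {
        // The delete only becomes permanent if publishing succeeds and we commit.
        await DeleteEventAsync(ev);
        await bus.PublishAsync(ev);
        CommitTransaction();
    }
    catch
    {
        // The event stays in the database so that it can be retried.
        RollbackTransaction();
    }
}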

Because PublishEventAsync may fail, you have to retry later, so you need a background process that retries failed sends, something like this:

foreach (var ev in eventsThatNeedToBeSent) {
    await PublishEventAsync(ev);
}
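
To make the flow above concrete, here is a minimal runnable sketch of the same idea. It assumes Python and the standard-library sqlite3 module instead of C#, so that it needs no external bus or database. The table names `Orders` and `PendingEvents`, the function names, and the `publish` callback (standing in for the message bus) are all hypothetical.

```python
import sqlite3


def create_schema(conn):
    conn.execute("CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE PendingEvents (EventId INTEGER PRIMARY KEY, OrderId INTEGER)"
    )


def place_order(conn, order_id, publish):
    """Save the order and its pending event atomically, then try to publish."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO Orders (OrderId) VALUES (?)", (order_id,))
        cur = conn.execute(
            "INSERT INTO PendingEvents (OrderId) VALUES (?)", (order_id,)
        )
        event_id = cur.lastrowid
    publish_event(conn, event_id, order_id, publish)


def publish_event(conn, event_id, order_id, publish):
    """Delete the pending event and publish it; keep the row if publishing fails."""
    try:
        with conn:
            conn.execute("DELETE FROM PendingEvents WHERE EventId = ?", (event_id,))
            publish({"event_id": event_id, "order_id": order_id})
        return True
    except Exception:
        return False  # rolled back: the row is still there for the retry pass


def retry_pending(conn, publish):
    """Background pass: re-send every event still stored in PendingEvents."""
    rows = conn.execute("SELECT EventId, OrderId FROM PendingEvents").fetchall()
    for event_id, order_id in rows:
        publish_event(conn, event_id, order_id, publish)
```

In this sketch a real deployment would call `retry_pending` periodically from a background worker. Because an event can be published and then fail to commit its deletion, it may be delivered more than once, so subscribers still need to tolerate duplicates.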
Timo
Jesús López
  • This won't really work for databases that don't support atomic transactions. For a document store database, I would save the event inside the document; a background process then picks up failed events, fires them, and marks each event as complete. In case marking an event as complete fails, you can keep an idempotency key that is unique enough for the event, such as the order ID, so that subscribers can look up the key and do nothing if it already exists. That way there are no side effects from firing the event twice if the background process tries it twice because it failed to mark it complete. – dukethrash Oct 12 '18 at 13:59
  • Well, I'm assuming you are working with a system that supports atomic transactions. Incidentally, there are several NoSQL databases that support atomic transactions, such as RavenDb. – Jesús López Oct 13 '18 at 07:59
  • @JesúsLópez two questions: 1. What if CommitTransaction() in PublishEventAsync fails? Does this mean that ev would eventually be retried by your retry process, but, assuming the consumer of ev is idempotent, this would not matter? 2. What if the server crashes in PublishEventAsync after bus.PublishAsync but before CommitTransaction? Is that the same scenario as 1. above? – GreenieMeanie Jun 15 '20 at 23:28
  • @GreenieMeanie, 1. The event would be published twice; the client should implement idempotency based on EventId uniqueness. 2. Yes, it's the same scenario as 1. – Jesús López Jun 16 '20 at 06:09
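
The idempotency based on EventId uniqueness mentioned in these comments could be sketched on the subscriber side as follows. This is only an illustration in Python with sqlite3; the `ProcessedEvents` table and the function names are assumptions made for the example.

```python
import sqlite3


def create_processed_table(conn):
    conn.execute("CREATE TABLE ProcessedEvents (EventId INTEGER PRIMARY KEY)")


def handle_once(conn, event_id, process):
    """Run process(event_id) at most once per EventId; return True if it ran."""
    with conn:  # commits on success, rolls back if process raises
        try:
            conn.execute(
                "INSERT INTO ProcessedEvents (EventId) VALUES (?)", (event_id,)
            )
        except sqlite3.IntegrityError:
            return False  # duplicate delivery: this EventId was already recorded
        process(event_id)
    return True
```

If `process` raises, the marker row is rolled back, so a later redelivery of the same event is handled again rather than silently dropped.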

You can make the this.bus.Publish call part of the same database transaction as this.SaveOrderToDataBase. That is, this.SaveOrderToDataBase executes in a transaction scope: if the database call fails, you never call the message queue, and if the message queue call fails, you roll back the database transaction, leaving both systems in a consistent state. If both calls succeed, you commit the database transaction.

Pseudocode:

open transaction
save order via transaction
in case of failure, report error and return
place order in message queue
in case of failure, report error, roll back transaction and return
commit transaction
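
As a rough illustration of this pseudocode, here is a small sketch in Python using the standard-library sqlite3 module. The `Orders` table and the `publish` callback (standing in for the message queue) are assumptions made for the example.

```python
import sqlite3


def place_order(conn, order_id, publish):
    """Save the order, publish inside the open transaction, then commit."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute("INSERT INTO Orders (OrderId) VALUES (?)", (order_id,))
            publish(order_id)  # a failure here rolls the insert back
        return True
    except Exception as exc:
        print(f"order {order_id} not placed: {exc}")
        return False
```

Note that the rollback only covers the database: the message queue does not take part in the transaction, so a message that has already been sent is not undone if the commit itself fails.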

You didn't mention any specific db technology so here's a link to a wiki article on transactions. Even if you're new to transactions, it's a good place to start. And a bit of good news: They are not hard to implement.

Paul Sasik
  • Good attempt! But this strategy has a problem: the event might be published and the order not saved to the database if the commit fails or the machine crashes just after placing the order in the message queue. – Jesús López Jun 11 '15 at 12:51
  • @JesúsLópez - True, but doing a db transaction gets you darn close, like 99.9999% (probably even better) and I don't think you can do better without distributing the transaction among several machines. – Paul Sasik Jun 11 '15 at 12:56
  • Yes you are right. However I think you can do better, please see my edit. – Jesús López Jun 11 '15 at 13:16