
Is there a known architectural solution for highly-distributed OLTP situations where pre-conditions apply? For example, let's use a banking example: Person A wants to transfer $N to Person B. The pre-condition for this to succeed is that Person A must have more than $N in their account.

From the perspective of Person A, they log into some web application and create a transfer from themselves to Person B for $N. Keep in mind that in the background, money is being withdrawn from and deposited into Person A's account in real time while the transfer is being created and applied. The money may exist when the transfer is created, but by the time it is applied it may not. In other words, this cannot be handled by client-side validation. Person A would like to know synchronously whether the transfer has succeeded or failed; they would not like to submit the transfer asynchronously and then return later to a queue or some notification that it has failed.

Is there a known architecture that solves this problem at large scale? If all accounts are in a single RDBMS, then you can do something like this via its built-in transactional capabilities (see the sketch below). But if you are using an eventually consistent NoSQL-style datastore, or a log/message-based infrastructure like Kafka, is there a known solution to problems like this?
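For clarity, here is roughly what I mean by "built-in transactional capabilities" in the single-RDBMS case. This is only a minimal sketch, assuming ADO.NET against SQL Server and a hypothetical Accounts(Id, Balance) table; the names, schema and amounts are illustrative.

// Single-RDBMS baseline: debit and credit inside one serializable transaction,
// with the pre-condition (balance greater than the transfer amount) enforced atomically.
using System.Data;
using System.Data.SqlClient;

string connectionString = "<connection string>";
int fromAccountId = 1, toAccountId = 2;   // hypothetical account ids
decimal amount = 100m;                    // $N

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (var tx = conn.BeginTransaction(IsolationLevel.Serializable))
    {
        // Debit only if the pre-condition still holds at commit time.
        var debit = new SqlCommand(
            "UPDATE Accounts SET Balance = Balance - @amt " +
            "WHERE Id = @from AND Balance > @amt", conn, tx);
        debit.Parameters.AddWithValue("@amt", amount);
        debit.Parameters.AddWithValue("@from", fromAccountId);

        if (debit.ExecuteNonQuery() == 1)
        {
            var credit = new SqlCommand(
                "UPDATE Accounts SET Balance = Balance + @amt WHERE Id = @to",
                conn, tx);
            credit.Parameters.AddWithValue("@amt", amount);
            credit.Parameters.AddWithValue("@to", toAccountId);
            credit.ExecuteNonQuery();
            tx.Commit();   // success is reported to Person A synchronously
        }
        else
        {
            tx.Rollback(); // pre-condition failed: insufficient funds
        }
    }
}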

Logicalmind
  • I'm not sure if you need an approach or a tool, but you might want to have a look at this: http://blog.cask.co/2014/11/how-we-built-it-designing-a-globally-consistent-transaction-engine/ – guillaume31 Jun 07 '16 at 11:21

3 Answers


What you basically need is a distributed locking mechanism. Many distributed server applications provide such a feature.

If we convert your question into code, it looks like this:

// BANK WITHDRAWAL APPLICATION

// Fetch BankAccount object from NCache
BankAccount account = cache.Get("Key") as BankAccount; // balance = 30,000
Money withdrawAmount = 15000;

if (account != null && account.IsActive)
{
    // Withdraw money and reduce the balance
    account.Balance -= withdrawAmount;

    // Update cache with new balance = 15,000
    cache.Insert("Key", account);
}

=========================

// BANK DEPOSIT APPLICATION

// Fetch BankAccount object from NCache
BankAccount account = cache.Get("Key") as BankAccount; // balance = 30,000
Money depositAmount = 5000;

if (account != null && account.IsActive)
{
    // Deposit money and increment the balance
    account.Balance += depositAmount;

    // Update cache with new balance = 35,000
    cache.Insert("Key", account); 
}

This is basically an example of a race condition:

A race condition is when two or more users try to access and change the same shared data at the same time but end up doing it in the wrong order.
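To make the race concrete with the numbers in the code comments above: both applications read the same starting balance of 30,000, and whichever writes last silently overwrites the other.

// Time  Withdrawal application          Deposit application
// t1    reads balance = 30,000
// t2                                    reads balance = 30,000
// t3    writes balance = 15,000
// t4                                    writes balance = 35,000
//
// Final balance is 35,000, but the correct result of applying both
// operations is 20,000: the withdrawal has been lost.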

With distributed locking, the above code would become:

LockHandle lockHandle = new LockHandle();

// Specify a time span of 10 sec for which the item remains locked.
// NCache will auto-release the lock after 10 seconds.
TimeSpan lockSpan = new TimeSpan(0, 0, 10);

bool acquireLock = true;        // acquire the lock when fetching the item
bool releaseLock = true;        // release the lock when updating the item
Money withdrawAmount = 15000;

try
{
    // If the fetch is successful, lockHandle will be populated.
    // The lockHandle object is later used to unlock the cache item.
    // acquireLock must be true if you want to acquire the lock.
    // If the item does not exist, account will be null.
    BankAccount account = cache.Get("Key", lockSpan,
        ref lockHandle, acquireLock) as BankAccount;
    // Lock acquired; otherwise a LockingException is thrown.

    if (account != null && account.IsActive)
    {
        // Withdraw money (or deposit)
        account.Balance -= withdrawAmount;
        // account.Balance += depositAmount;

        // Insert the data in the cache and release the lock simultaneously.
        // The LockHandle initially used to lock the item must be provided.
        // releaseLock should be true to release the lock, otherwise false.
        cache.Insert("Key", account, lockHandle, releaseLock);
    }
    else
    {
        // Either the item does not exist or it could not be cast.
        // Explicitly release the lock in case of errors.
        cache.Unlock("Key", lockHandle);
    }
}
catch (LockingException lockException)
{
    // The lock couldn't be acquired.
    // Wait and try again.
}
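To tie this back to the pre-condition in the question: the balance check itself must also happen while the lock is held, otherwise the same race reappears between the check and the write. Here is a hedged sketch of that flow, reusing only the NCache calls shown above; the transferAmount variable and the success flag are illustrative assumptions, not part of the original example.

LockHandle lockHandle = new LockHandle();
TimeSpan lockSpan = new TimeSpan(0, 0, 10);
Money transferAmount = 15000;   // the $N being transferred (illustrative)
bool transferSucceeded = false;

try
{
    // Fetch the account and acquire the lock in a single call
    BankAccount account = cache.Get("Key", lockSpan,
        ref lockHandle, true) as BankAccount;

    // Check the pre-condition while the lock is held
    if (account != null && account.IsActive && account.Balance > transferAmount)
    {
        account.Balance -= transferAmount;
        cache.Insert("Key", account, lockHandle, true); // write and release the lock
        transferSucceeded = true;                       // report success to Person A synchronously
    }
    else
    {
        cache.Unlock("Key", lockHandle);                // pre-condition failed: report failure
    }
}
catch (LockingException)
{
    // Lock could not be acquired; retry or report the failure synchronously
}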

This answer is very specific to NCache (a distributed cache). I'm sure you'll find more solutions under the keyword "distributed locking".

Source

Basit Anwer
  • I think the question he asked was to know what is required from an architectural and infrastructure standpoint – I.Tyger Sep 13 '18 at 14:18

Have you taken a look at Splice Machine? It is a fully ACID-compliant RDBMS that runs on top of the Hadoop stack (HBase, Spark, HDFS, ZooKeeper). It has a dual architecture that uses HBase for quick OLTP queries and Spark for OLAP queries, and it has built-in transactional capabilities that do not require any locking.

tdong

ClustrixDB is another solution that might be worth checking out. It uses Paxos for distributed transaction resolution, built into a distributed, ACID-compliant SQL RDBMS, and it also has built-in fault tolerance.

lucygucy