
Looking for a minimal, futex-based implementation of a single-writer/multiple-readers lock requiring no space overhead beyond a single 4-byte futex state variable.

Some background: I have an application that will embed a lock within each of tens to hundreds of millions of small objects. Because the locking is so fine-grained, and because of the structure of the application, I anticipate minimal contention. Further, writers will be rare and contending writers rarer still. For all of these reasons, in this particular setting, a solution prone (in theory) to the "thundering herd" phenomenon is quite acceptable.

Thilo
John Yates
  • You cannot build a working, POSIX-compliant rwlock using nothing but a single futex. Please clearly define the desired behavior of your rwlock. For example: Must it support recursive read locking? Must a blocked writer prevent further read locks, or is starving writers not a concern? It would be enough to provide a documented version of the API you expect. – Jeremy W. Sherman Oct 19 '10 at 17:09
  • Jeremy, if you reread the description of my problem, I never indicated a need for POSIX compliance. I expect minimal contention but require support for both shared and exclusive locking. – John Yates Dec 19 '10 at 00:55
  • (Continuing previous comment) Having received no pointers to potential solutions, I ended up rolling my own. My implementation is indeed futex-based. It provides 1, 2 and 4 byte variants, and supports up to 63 readers in a 1 byte lock and 16383 readers in a 2 byte lock. On x86, uncontended transitions typically require a single atomic cmpxchg. My company was just bought by IBM. I am checking if I can release the code under GPL. – John Yates Dec 19 '10 at 01:06
  • Beware of Smoku's code, it has a bug. When a write lock is released it will only wake 1 reader instead of all of them. – hendo Jun 26 '17 at 08:14

1 Answer


You will find my implementation at https://gist.github.com/smokku/653c469d695d60be4fe8170630ba8205

The idea is that the futex word encodes the whole lock state: 0 means a single thread holds the lock for writing, 1 means the lock is open, and a value greater than 1 means the lock is held by readers (value minus 1 of them). A value of 0 therefore blocks both readers and writers on the futex, while values above 1 block only writers. The unlocking thread wakes waiters on the futex, and you need to be careful that a writer does not consume a wake-up that only a reader can use.

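#define _GNU_SOURCE
#include <limits.h>        // INT_MAX
#include <linux/futex.h>   // FUTEX_WAIT_PRIVATE, FUTEX_WAKE_PRIVATE
#include <sys/syscall.h>   // SYS_futex
#include <unistd.h>        // syscall()

#define cpu_relax() __builtin_ia32_pause()
#define cmpxchg(P, O, N) __sync_val_compare_and_swap((P), (O), (N))

static unsigned _lock = 1; // read-write lock futex
const static unsigned _lock_open = 1;    // nobody holds the lock
const static unsigned _lock_wlocked = 0; // one writer holds the lock

static void _unlock()
{
    unsigned current, wanted;
    do {
        current = _lock;
        if (current == _lock_open) return; // nothing to release
        if (current == _lock_wlocked) {
            wanted = _lock_open;           // writer releases the lock
        } else {
            wanted = current - 1;          // one reader leaves
        }
    } while (cmpxchg(&_lock, current, wanted) != current);
    // After a write unlock every blocked reader can proceed, so wake all
    // waiters rather than one; a reader unlock still wakes a single waiter.
    syscall(SYS_futex, &_lock, FUTEX_WAKE_PRIVATE,
            current == _lock_wlocked ? INT_MAX : 1, NULL, NULL, 0);
}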

static void _rlock()
{
    unsigned current;
    while ((current = _lock) == _lock_wlocked || cmpxchg(&_lock, current, current + 1) != current) {
        while (syscall(SYS_futex, &_lock, FUTEX_WAIT_PRIVATE, current, NULL, NULL, 0) != 0) {
            cpu_relax();
            if (_lock >= _lock_open) break;
        }
        // will be able to acquire rlock no matter what unlock woke us
    }
}

static void _wlock()
{
    unsigned current;
    while ((current = cmpxchg(&_lock, _lock_open, _lock_wlocked)) != _lock_open) {
        while (syscall(SYS_futex, &_lock, FUTEX_WAIT_PRIVATE, current, NULL, NULL, 0) != 0) {
            cpu_relax();
            if (_lock == _lock_open) break;
        }
        if (_lock != _lock_open) {
            // in rlock - won't be able to acquire lock - wake someone else
            syscall(SYS_futex, &_lock, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
        }
    }
}
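
The code above guards one global `_lock`, whereas the question asks for a lock word embedded in each object. A minimal sketch of how the same state encoding (0 = write-locked, 1 = open, n > 1 = n - 1 readers) could be driven through a pointer to a per-object word follows. It is not taken from the gist: `struct small_object`, `rw_try_wlock`, `rw_try_rlock` and `rw_unlock` are hypothetical names, only the non-blocking fast paths are shown, reader-count overflow is not handled, and waking all waiters after a write unlock is an assumption based on the comment about the single-wake bug.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <linux/futex.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same encoding as the answer: 0 = write-locked, 1 = open,
 * n > 1 = (n - 1) readers. Names are hypothetical. */
enum { RW_WLOCKED = 0, RW_OPEN = 1 };

/* Hypothetical small object carrying its own 4-byte lock word. */
struct small_object {
    uint32_t lock;    /* must be initialized to RW_OPEN */
    uint32_t payload;
};

/* Try to take the lock exclusively; succeeds only if it is open. */
static bool rw_try_wlock(uint32_t *word)
{
    uint32_t expected = RW_OPEN;
    return __atomic_compare_exchange_n(word, &expected, RW_WLOCKED, false,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Try to take the lock shared; fails only while a writer holds it. */
static bool rw_try_rlock(uint32_t *word)
{
    uint32_t cur = __atomic_load_n(word, __ATOMIC_RELAXED);
    while (cur != RW_WLOCKED) {
        /* On failure the builtin reloads cur with the current value. */
        if (__atomic_compare_exchange_n(word, &cur, cur + 1, false,
                                        __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
            return true;
    }
    return false;
}

/* Release either kind of hold, then wake waiters blocked on the word. */
static void rw_unlock(uint32_t *word)
{
    uint32_t cur = __atomic_load_n(word, __ATOMIC_RELAXED);
    uint32_t next;
    do {
        if (cur == RW_OPEN)
            return;                       /* nothing to release */
        next = (cur == RW_WLOCKED) ? RW_OPEN : cur - 1;
    } while (!__atomic_compare_exchange_n(word, &cur, next, false,
                                          __ATOMIC_RELEASE, __ATOMIC_RELAXED));
    /* cur still holds the value we replaced: wake everyone after a
     * write unlock (assumption), a single waiter after a read unlock. */
    syscall(SYS_futex, word, FUTEX_WAKE_PRIVATE,
            cur == RW_WLOCKED ? INT_MAX : 1, NULL, NULL, 0);
}
```

A caller would initialize `obj.lock` to `RW_OPEN` and pass `&obj.lock` to these functions. A blocking variant would fall back to `FUTEX_WAIT_PRIVATE` on the same word when a try call fails, in the same way `_rlock` and `_wlock` do above.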
smokku