
My Rust code uses an RwLock to process data in multiple threads. Each thread fills a common storage area while holding the read lock (e.g. filling up a database, though my case is a bit different). Eventually the common storage fills up. At that point I need to pause all processing, reallocate the storage space (e.g. allocate more disk space from the cloud), and continue.

// pseudo-code
fn thread_worker(tasks) {
  let mut lock = rwlock.read().unwrap();
  for task in tasks {
    // please ignore the out_of_space check race condition --
    // it's here just to explain the question
    if out_of_space {
      drop(lock);
      let write_lock = rwlock.write().unwrap();
      // get more storage
      drop(write_lock);
      lock = rwlock.read().unwrap();
    }
    // handle task WITHOUT getting a read lock on every pass;
    // getting a lock is far costlier than the actual task processing
  }
  drop(lock);
}

Since all threads will quickly run out of space at about the same time, they can all release the read lock and request the write lock. The first thread that gets the write lock will fix the storage issue. But now I have a possible temporary-deadlock situation: all the other threads are also waiting for the write lock, even though they no longer need it.

So this situation is possible: given 3 threads all waiting for the write lock, the 1st gets it, fixes the issue, releases it, and waits for the read lock. The 2nd enters the write lock but quickly skips out because the issue is already fixed, and releases it. The 1st and 2nd threads then take the read lock and continue processing, but the 3rd is still waiting for the write lock, and will keep waiting for a very long time, until the first two either run out of space again or finish all their work.

Given all threads waiting for the write lock, how can I "abort" all the other threads' waits from the first thread, after it finishes its work but before it releases the write lock it already holds?

I saw there is a poisoning feature, but it was designed for panics, and reusing it in production code seems wrong and tricky to get right. Also, the Rust developers are considering removing it.

P.S. Each loop iteration is essentially a data[index] = value assignment, where data is a giant memmap shared by many threads. The index slowly grows in all threads, so eventually all threads run out of memmap size. When that happens, the memmap is destroyed, the file is reallocated, and a new memmap is created. Thus, taking a read lock on every loop iteration is not feasible.
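For illustration, the grow step under the write lock might look roughly like this. This is only a sketch, assuming the memmap2 crate, with the file path and new length supplied by the caller; how the workers write through the shared map while holding the read lock is out of scope here.

use std::fs::OpenOptions;
use std::sync::RwLock;
use memmap2::MmapMut;

// Sketch of the reallocation step only (assumes the memmap2 crate).
// Runs under the write lock, so no worker can be touching the map.
fn grow_storage(storage: &RwLock<MmapMut>, path: &str, new_len: u64) -> std::io::Result<()> {
    let mut map = storage.write().unwrap();
    let file = OpenOptions::new().read(true).write(true).open(path)?;
    file.set_len(new_len)?;                     // reallocate the backing file
    *map = unsafe { MmapMut::map_mut(&file)? }; // recreate the memmap; the old map is dropped
    Ok(())
}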

Yuri Astrakhan
  • You can do something like that with [`parking-lot`](https://crates.io/crates/parking_lot), which provides a `RwLock` with an `unlock_fair` function to make sure that the thread that just released the write lock won't reacquire the read lock immediately, and a `try_write_for` function that includes a timeout in case one of the other threads still managed to reacquire the read lock. – Jmb Nov 21 '21 at 09:21
  • Fair locking won't help because it's *readers* that stall the other would-be writers. `try_write_for()` would help, but would require tweaking the timeout values. A simpler and more effective fix is to have only one thread do the writing, and the others to fall back to reading; details in the answer I've now written. – user4815162342 Nov 21 '21 at 19:21
  • @user4815162342 Fair locking will help prevent the thread that got the write lock from reacquiring the read lock immediately after releasing the write lock, thus preventing the other threads from getting the write lock in turn. `try_write_for` with a reasonably short timeout will help them proceed if somebody else reacquired the read lock out of turn. – Jmb Nov 23 '21 at 18:07
  • @Jmb *Fair locking will help prevent the thread that got the write lock from reacquiring the read lock* - I understood the problem was in _other_ threads acquiring the read lock, and thus preventing would-be writers from proceeding for a long time. The `try_write_for()` solution feels like somewhat of a hack, but I guess it could be made to work. I'm not sure how the code in the accepted answer is supposed to help, though. – user4815162342 Nov 23 '21 at 19:29
  • The extra mutex in the accepted answer solves the core problem. The first thread to hit the `out_of_space` condition will grab the mutex and ask for the write lock. As described above, the other threads will quickly hit the same condition and release their read locks, but won't enter the mutex, so the 1st thread will get the write lock, resize, release the write lock, release the mutex, and get the read lock. The other threads will get the mutex in turn, but will not try to get a write lock; they will simply re-acquire the read lock -- problem solved. – Yuri Astrakhan Nov 23 '21 at 20:03
  • @user4815162342 except that when one thread finishes reallocating, the other threads are all attempting to acquire the write lock, so one of them will get it, release it immediately (because reallocation is no longer needed), attempt to get the read lock but be sent to the back of the queue thanks to fair locking. – Jmb Nov 23 '21 at 21:57
  • In fact the timeout is not needed. Re-reading [the parking-lot documentation](https://docs.rs/parking_lot/0.11.2/parking_lot/type.RwLock.html): _readers trying to acquire the lock will block even if the lock is unlocked when there are writers waiting to acquire the lock_, so everything will work out, since by the time any thread attempts to re-acquire the read lock, the others that are waiting on the write lock will take precedence. – Jmb Nov 23 '21 at 22:03
  • BTW [the same is true for systems where standard locks rely on pthread](https://linux.die.net/man/3/pthread_rwlock_unlock) (e.g. Linux) although not guaranteed by the Rust stdlib. – Jmb Nov 23 '21 at 22:05
  • @Jmb *the other threads are all attempting to acquire the write lock* - This is important and is not how I understood the question. I understood that only _some_ threads are attempting to write at once, while there are still others that are attempting to read all the time. After re-reading the question, I see that you are right. And I also see a flaw with my answer: the thread that gets to flip the `AtomicBool` now races for the write lock with the others, which race for the read lock. – user4815162342 Nov 23 '21 at 22:23
  • @Jmb So basically, all the OP needs to do is switch to `parking_lot::RwLock`, and the problem will go away? – user4815162342 Nov 23 '21 at 22:26
  • Thank you @Jmb, I did not realize the parking-lot version actually solves this as well, and is simpler! Yet, I think the extra-mutex answer is a tiny bit faster -- if a thread got the mutex, it can quickly test, drop the mutex, get the read lock, and continue its work without waiting for all other threads to go through that. With parking-lot, all threads are paused until all threads go through the write lock + test. – Yuri Astrakhan Nov 23 '21 at 22:37
  • P.S. @Jmb, please post your answer too - while I think the mutex version is faster (because of my explanation just above and also down below in the comments), I think in many cases yours will be far easier and simpler for other users, and definitely deserves a +1. – Yuri Astrakhan Nov 23 '21 at 22:48

2 Answers


First note that, depending on your target platform, your code may already work as is: for example on platforms where Rust's standard locks rely on libpthread (e.g. Linux), or on any other platform where write locks take precedence over read locks.

If you want a cross-platform solution, all you need to do is switch to parking_lot, which provides a fair implementation of RwLock. In particular, this means that readers trying to acquire the lock will block, even if the lock is currently unlocked, whenever there are writers waiting to acquire it.

Here's the sequence of events with a fair RwLock:

  • Initially all threads are running and hold the read lock.
  • First thread to run out of space releases the read lock and requests the write lock. Since the other threads still hold the read lock, the first thread is blocked.
  • One after the other, the other threads run out of space, release the read lock and request the write lock.
  • Once all the threads have released the read lock, one of them acquires the write lock.
  • The thread that got the write lock allocates more memory, releases the write lock and requests the read lock. Since the other threads waiting on the write lock take precedence, the read request blocks.
  • One after the other, the other threads acquire the write lock, notice that there is memory available, release the write lock and request the read lock.
  • Once all the threads have acquired and released the write lock, they all acquire the read lock and proceed.

Note that there is a theoretical race condition that could keep one of the threads blocked after memory has been allocated, if the other threads manage to proceed in the time it takes that thread to release the read lock and request the write lock, e.g.:

drop(lock);
// Another thread gets the write lock, allocates memory and releases the lock
// All the other threads acquire and release the write lock
// At least one other thread acquires the read lock
let write_lock = rwlock.write().unwrap();

Given the time it takes to allocate memory alone, the probability of this happening in real life is so vanishingly small that it can be discounted.
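For reference, here is roughly what the worker loop could look like with parking_lot's fair lock. Storage, Task and the helpers out_of_space, grow_storage and handle_task are hypothetical stand-ins for the question's actual memmap handling, so treat this as a sketch of the locking pattern rather than a drop-in implementation.

use parking_lot::RwLock;

// Hypothetical shared state; only the locking pattern matters here.
struct Storage { used: usize, capacity: usize }
struct Task;

fn out_of_space(s: &Storage) -> bool { s.used >= s.capacity }
fn grow_storage(s: &mut Storage) { s.capacity *= 2; }
fn handle_task(_s: &Storage, _t: &Task) { /* data[index] = value */ }

fn thread_worker(rwlock: &RwLock<Storage>, tasks: &[Task]) {
    let mut lock = rwlock.read();
    for task in tasks {
        if out_of_space(&lock) {
            drop(lock);
            {
                // Reallocate only if another thread has not already done so.
                let mut write_lock = rwlock.write();
                if out_of_space(&write_lock) {
                    grow_storage(&mut write_lock);
                }
            } // write lock released here
            // With the fair policy, this read request queues behind the
            // writers that are still waiting, so none of them is starved.
            lock = rwlock.read();
        }
        handle_task(&lock, task);
    }
}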

Jmb
  • Thanks, and a good catch with the race condition. I will keep the other answer as accepted for 3 reasons: it does not require a 3rd-party crate, it has no race condition, and it does not require all threads to wait for all write locks to finish before continuing (admittedly a very short wait, because no work is actually done inside the write lock). I do agree that it is cleaner though :) – Yuri Astrakhan Nov 24 '21 at 17:09

Looking at your code, you could get away with an extra mutex:

// pseudo-code
fn thread_worker(tasks) {
  let mut lock = rwlock.read(); // hold the read lock while writing to shared storage

  for task in tasks {
    if out_of_space {
      drop(lock);
      {
        let guard = mutex.lock();
        if out_of_space { // potentially already fixed by another worker
          let write_lock = rwlock.write();
          // get more storage
          ...
          // drop(write_lock); is automatic here
        }
        // drop(guard); is automatic here
      }
      lock = rwlock.read();
    }

    // copy memory for the task
    ...
  }
}

The pattern used here is known as double-checked locking.

This solves your issue: after reallocation, the next thread will not wait on rwlock.write() forever, because it will not pass the out_of_space check inside the mutex critical section.

However, this solution still has the issue that the first worker to run out of space will wait for all the other workers to hit the out_of_space condition before it can proceed with reallocation, because it needs to wait for all the read() locks to be dropped.

I'd recommend refactoring this code to move the reallocation logic out of this method.

Also, try to avoid explicit drops where possible, in favor of RAII, which is usually good practice.
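As a sketch of that suggestion (using std's RwLock and Mutex; ensure_space and the Storage fields are hypothetical names), the double-checked section could be pulled into a helper so the worker loop keeps no explicit guard management:

use std::sync::{Mutex, RwLock, RwLockReadGuard};

// Hypothetical shared state; only the locking pattern matters here.
struct Storage { used: usize, capacity: usize }

fn out_of_space(s: &Storage) -> bool { s.used >= s.capacity }

// Double-checked reallocation, factored out of the worker loop.
// Consumes the caller's read guard and hands back a fresh one once
// there is space again; the other guards are released by RAII.
fn ensure_space<'a>(
    rwlock: &'a RwLock<Storage>,
    mutex: &Mutex<()>,
    lock: RwLockReadGuard<'a, Storage>,
) -> RwLockReadGuard<'a, Storage> {
    drop(lock); // give up the read lock so a writer can make progress
    {
        let _serialize = mutex.lock().unwrap();
        // Re-check: another worker may already have grown the storage.
        let still_full = out_of_space(&rwlock.read().unwrap());
        if still_full {
            let mut write_lock = rwlock.write().unwrap();
            write_lock.capacity *= 2; // get more storage
        } // write_lock released here
    } // mutex released here
    rwlock.read().unwrap()
}

The out-of-space branch in the worker then shrinks to lock = ensure_space(&rwlock, &mutex, lock); and every guard is released by going out of scope.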

battlmonstr
  • @YuriAstrakhan is this some kind of video/audio processing? Have you checked that it is too slow to lock/unlock? RwLock seems to be based on pthread rwlock, which is heavily optimized doing some optimistic atomic checks (at least on Linux). Otherwise, are you able to predict or calculate how big is the task space requirement size? In this case one strategy would be to preallocate processing space in advance, and only run until the space is full. Then reallocate and recreate the workers on the manager level. – battlmonstr Nov 21 '21 at 22:18
  • My apologies, I misspoke. I do control the entire `thread_workers` fn. The body of the loop is just a simple array assignment by index -- `data[index] = value`, where the `data` is a multi-GBs memory map shared between threads. There is no way to subdivide threads to work on independent data segments, but I can guarantee threads won't collide on index. So locking on each assignment will kill perf. – Yuri Astrakhan Nov 21 '21 at 22:46
  • @YuriAstrakhan . I see. Thanks for clarification. 1) What happens if you just have a mutex? What does the .read() protect from? 2) Is it possible to finish the worker when out_of_space happens, handle it in the manager, and recreate the worker? – battlmonstr Nov 22 '21 at 10:05
  • 1) read() protects from writing to the data while it is being destroyed and recreated -- if thread 1 wants to grow data (destroy memmap, grow file, recreate memmap), thread 2 could still be setting values on it. 2) worker gets a data block it must process, and it cannot finish until it is done - essentially it goes through an iterator and writes to memmap. – Yuri Astrakhan Nov 22 '21 at 14:58
  • @YuriAstrakhan I see. Is it possible to make your read() lock a bit more granular? It has to be unlocked one way or the other in order for the writers to be able to proceed. If not every iteration, then maybe unlock it every N iterations or every N Mb written? Would that work? – battlmonstr Nov 22 '21 at 16:52
  • I've updated the code to illustrate this. – battlmonstr Nov 22 '21 at 16:59
  • @battlmonstr, thx for all your help! My full code doesn't yet compile (I'm still learning Rust), but it should give more explanation https://gist.github.com/nyurik/daea497c122e92ef2637904fc284fd6c – Yuri Astrakhan Nov 22 '21 at 23:07
  • @YuriAstrakhan I've updated the answer based on your code. – battlmonstr Nov 23 '21 at 12:43
  • This won't work because there is a fair chance that whatever thread got the write lock will also reacquire the read lock immediately after releasing the write lock, thus preventing other threads from acquiring the write lock in turn. – Jmb Nov 23 '21 at 18:03
  • Also what's the point of the extra mutex? The write lock already ensures that you have exclusive access (i.e. all the read locks have been dropped). – Jmb Nov 23 '21 at 18:12
  • Thx @Jmb, but I think you are mistaken. If a thread got the mutex + write lock, grew the storage, released the write lock + mutex, and acquired the read lock, another thread will not even try to get the write lock. Another thread is waiting on the MUTEX, not the write lock. And it will only wait on the write lock inside the mutex lock, and only if it is still out of storage. – Yuri Astrakhan Nov 23 '21 at 22:42
  • @YuriAstrakhan Yes, I realized after commenting that the mutex allows the thread that allocated the memory to reacquire the read lock and proceed without waiting for the others to notice that the memory has been allocated. Might be worth mentioning explicitly in the answer. – Jmb Nov 24 '21 at 08:09