My Rust code uses RwLock to process data in multiple threads. Each thread fills a common storage while holding the read lock (e.g. filling up a database, but my case is a bit different). Eventually, the common storage will fill up. I need to pause all processing, reallocate storage space (e.g. allocate more disk space from the cloud), and continue.
    // pseudo-code
    fn thread_worker(tasks) {
        // `mut` because the read guard is re-acquired after a reallocation
        let mut lock = rwlock.read().unwrap();
        for task in tasks {
            // please ignore out_of_space check race condition
            // it's here just to explain the question
            if out_of_space {
                drop(lock);
                let write_lock = rwlock.write().unwrap();
                // get more storage
                drop(write_lock);
                lock = rwlock.read().unwrap();
            }
            // handle task WITHOUT getting a read lock on every pass
            // getting a lock is far costlier than actual task processing
        }
        drop(lock);
    }
Since all threads will hit out of space at about the same time, they can all release the read lock and request the write lock. The first thread that gets the write lock will fix the storage issue. But now I have a possible temporary deadlock situation: all other threads are also waiting for the write lock even though they no longer need it.
So this situation is possible: given 3 threads all waiting for the write lock, the 1st gets it, fixes the issue, releases it, and waits for the read lock. The 2nd acquires the write lock but quickly skips the fix because the issue is already resolved, and releases it. The 1st and 2nd threads then acquire the read lock and continue processing, but the 3rd is still waiting for the write lock, and will keep waiting for a very long time until the first two either run out of space or finish all their work.
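To make the "2nd thread quickly skips" step concrete, here is a minimal sketch of the re-check under the write lock. The `Storage` struct, its `capacity` and `grow_count` fields, and `grow_if_still_full` are hypothetical names used only for illustration; doubling the capacity stands in for the real reallocation. This only shows the skip and does not address the 3rd thread's long wait.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for the shared storage. `grow_count` records how many
// times the storage was actually reallocated.
struct Storage {
    capacity: usize,
    grow_count: usize,
}

// Called by a worker that ran out of space while the capacity was `seen_capacity`.
// Only the first writer grows the storage; later writers notice that the
// capacity has already changed and release the write lock immediately.
fn grow_if_still_full(storage: &RwLock<Storage>, seen_capacity: usize) {
    let mut guard = storage.write().unwrap();
    if guard.capacity == seen_capacity {
        guard.capacity *= 2; // placeholder for "get more storage"
        guard.grow_count += 1;
    }
    // write guard is dropped here
}

// Three threads all observe capacity 4 and all request the write lock.
fn run_demo() -> (usize, usize) {
    let storage = Arc::new(RwLock::new(Storage { capacity: 4, grow_count: 0 }));
    let handles: Vec<_> = (0..3)
        .map(|_| {
            let storage = Arc::clone(&storage);
            thread::spawn(move || grow_if_still_full(&storage, 4))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let guard = storage.read().unwrap();
    (guard.capacity, guard.grow_count)
}

fn main() {
    let (capacity, grow_count) = run_demo();
    // The storage is grown exactly once, no matter which thread wins.
    println!("capacity = {capacity}, grow_count = {grow_count}");
}
```

Every thread still has to acquire the write lock just to find out there is nothing to do, which is exactly the wait I would like to avoid.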
Given all threads waiting for the write lock, how can I "abort" all the other threads' waits from the first thread after it finishes its work, but before it releases the write lock it already holds?
I saw there is a poisoning feature, but that was designed for panics, and reusing it in production seems wrong and tricky to get right. Also, the Rust devs are thinking of removing it.
P.S. Each loop iteration is essentially a data[index] = value assignment, where data is a giant memmap shared by many threads. The index grows slowly in all threads, so eventually all threads run past the end of the memmap. When that happens, the memmap is destroyed, the file is reallocated, and a new memmap is created. Since each iteration is so cheap, getting a read lock on every loop iteration is not an option.
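For anyone who wants something runnable, below is a minimal sketch of that loop under loud assumptions: a `Vec<AtomicU64>` stands in for the memmap (atomics allow writes through the shared read guard), resizing the vector stands in for reallocating the file, and `Data`, `worker`, and `run_demo` are hypothetical names. It reproduces the locking pattern from the pseudo-code, including the long write-lock waits, and is not a solution.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for the memmap: atomic cells can be written through
// a shared reference, so a read lock is enough for `data[index] = value`.
type Data = Vec<AtomicU64>;

fn worker(data: &RwLock<Data>, tasks: &[(usize, u64)]) {
    // Acquire the read lock once and keep it across iterations.
    let mut guard = data.read().unwrap();
    for &(index, value) in tasks {
        if index >= guard.len() {
            // Out of space: release the read lock before asking for the write lock.
            drop(guard);
            {
                let mut writer = data.write().unwrap();
                // Re-check: another thread may already have grown the storage.
                if index >= writer.len() {
                    let new_len = (index + 1).next_power_of_two();
                    writer.resize_with(new_len, || AtomicU64::new(0));
                }
            } // write lock released here
            guard = data.read().unwrap();
        }
        // The cheap per-task work, done without touching the lock.
        guard[index].store(value, Ordering::Relaxed);
    }
}

// Two threads write interleaved indices 0..100 into storage that starts at 4 cells.
fn run_demo() -> Vec<u64> {
    let initial: Data = (0..4).map(|_| AtomicU64::new(0)).collect();
    let data = Arc::new(RwLock::new(initial));
    let handles: Vec<_> = (0..2usize)
        .map(|t| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let tasks: Vec<(usize, u64)> = (0..50)
                    .map(|k| {
                        let index = t + 2 * k;
                        (index, index as u64 * 10)
                    })
                    .collect();
                worker(&data, &tasks);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let guard = data.read().unwrap();
    let values: Vec<u64> = guard
        .iter()
        .take(100)
        .map(|cell| cell.load(Ordering::Relaxed))
        .collect();
    values
}

fn main() {
    let values = run_demo();
    println!("wrote {} values", values.len());
}
```

A thread never requests the write lock while still holding its read guard, so the sketch cannot deadlock outright; a thread can only be delayed until the current readers release their guards.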