Dedicated rayon pool to submit work and asynchronously wait for completion

Question

I have a rayon::ThreadPool which I want to use to perform CPU-bound tasks outside of Tokio's runtime context. These tasks are synchronous.

The problem is that spawn requires the closure to be 'static, but I want to use borrowed data instead of making owned copies.

Initially I thought scope would work (and the code compiles too), but it blocks until the closure completes, which defeats my purpose of using this pool, i.e. to not block the Tokio runtime.

How can I achieve this with rayon or any other threadpool implementation?

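use rayon::{ThreadPool, ThreadPoolBuilder};
// Assumed to be Tokio's oneshot; futures::channel::oneshot would also fit.
use tokio::sync::oneshot;

pub struct TaskPool {
    pool: ThreadPool,
}

impl TaskPool {
    pub fn new(num_threads: usize) -> Self {
        Self {
            pool: ThreadPoolBuilder::new()
                .num_threads(num_threads)
                .build()
                .unwrap(),
        }
    }

    pub async fn verify(&self, hash: &[u8], data: &[u8]) -> bool {
        let (tx, rx) = oneshot::channel();
        // Does not compile: spawn needs a 'static closure, but this one
        // borrows `hash` and `data`.
        self.pool.spawn(|| {
            let ok = hash == calc_sha1(data);
            tx.send(ok).unwrap();
        });
        rx.await.unwrap()
    }
}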


I'm not sure that would be sound. What's to stop someone from starting to await `verify()` with owned data, and then dropping the future after `calc_sha1()` commences but before it finishes? There is a good reason why scope-based interfaces block. — user4815162342, May 30 '21 at 18:09
If you control the data type and insist on not making a copy, you could use `Arc`. Or more awkwardly, you could move the data into the closure and return it after the operation through the channel. — piojo, May 31 '21 at 05:25
@user4815162342 - I agree on the soundness issue. The use case I generally see is an async task that you want to run from a sync context, so you make an executor, call `block_on`, and wait for it to finish. My use case is the opposite: I have sync work in an async context that I want to run in a separate thread pool and *not* block on, but rather asynchronously wait for it to finish. — Gurwinder Singh, May 31 '21 at 06:29
@piojo - The awkward approach seemed too awkward. I ended up with `Arc` (for now). — Gurwinder Singh, May 31 '21 at 06:30
Awaiting sync work from async code is not uncommon at all: virtually all filesystem operations work like that, and so does interfacing with CPU-bound or "legacy" non-async blocking code. Since it requires the use of threads, the standard approach is to learn to live with the `'static` requirement and pass around owned data, if necessary faking it with `Arc`. This is not always possible, as your `verify()` example shows - there is no way to use an `Arc` to pass data to `calc_sha1()` without modifying the signature of `verify()` or copying `data`. — user4815162342, May 31 '21 at 07:18

