3

I have a WriterJob struct. Its functionality is not important; what it does is that it spawns a long-running task and provides callers an api to check the status of the job.

pub(crate) struct WriterJob {
    ...,
    status: Status,
}

impl WriterJob {
    async fn writer_job_impl(&mut self, rx: Receiver<WriteCommand>) {
       // Job implementation details. It's important that `self` is mutuable
       // since I update the job status
       // eg.
       // self.status = Status::Ready;
    }

    pub(crate) fn status(&self) -> Status {
       self.status
    }

    pub(crate) fn spawn_writer_job(&mut self) -> Sender<WriteCommand> {
        let (tx, rx) = mpsc::channel(10);

        let handle = tokio::spawn(async move {
            self.writer_job_impl(rx).await;
        });

        self.status = Status::Spawned;

        tx
    }

I get this error:

error[E0521]: borrowed data escapes outside of associated function
  --> src/io/buffered_write.rs:92:22
   |
89 |       pub(crate) fn spawn_writer_job(&mut self) -> Sender<WriteCommand> {
   |                                      ---------
   |                                      |
   |                                      `self` is a reference that is only valid in the associated function body
   |                                      let's call the lifetime of this reference `'1`
...
92 |           let handle = tokio::spawn(async move {
   |  ______________________^
93 | |             self.writer_job_impl(rx).await;
94 | |         });
   | |          ^
   | |          |
   | |__________`self` escapes the associated function body here
   |            argument requires that `'1` must outlive `'static`

I think I understand that the compiler is complaining that it doesn't know whether self will live as long as the spawned task hence a lifetime error. But I'm not sure what a good approach to this is. One possibility is to use Arc<Mutex<Status>> or Arc<RwLock<Status>> but I don't like this approach because I will probably need to add more mutuable fields in self. Are there cleaner ways to do this?

danronmoon
  • 3,814
  • 5
  • 34
  • 56
Kiwi breeder
  • 459
  • 4
  • 11
  • 1
    Also consider, how does the compiler enforce that you don't borrow `self` again while the spawned task is running? It doesn't know how long the task will run for so the only way to make that work is by _moving_ `self` or by using interior mutability. – Peter Hall Apr 29 '23 at 19:57
  • Depending on your needs, you might also be able to use [scoped tasks](https://docs.rs/tokio-scoped/latest/tokio_scoped/). – Peter Hall Apr 29 '23 at 20:52
  • Using `pub(crate)` on a `struct` that is already set that way seems a bit redundant. – tadman Apr 30 '23 at 02:12
  • 2
    Extract `status` to a separate state struct, and wrap it with `Arc`. You want to read and write it from different threads, so you've got to sync it somehow. If you don't want to have a mutex, switch to atomics or atomic enums. – xamgore Apr 30 '23 at 05:53

2 Answers2

1

You can't assume that the self isn't dropped during execution of the handle that's the why it throws the error. Instead, you can change the design of your structure to make it work

  1. Create a sub struct with data you need to share between threads
struct WriteJobInner {
    ...,
    status: Status,
}
  1. Create the struct to execute operations on shader data
// you can use 'std::sync::Mutex' if you need to touch the data in non-async
// functions
use tokio::sync::Mutex;
use std::sync::Arc;

#[derive(Clone)]
pub(crate) struct Writejob {
    inner: Arc<Mutex<WriteJobInner>>,
}

Also, probably you can replace the tokio's mutex with tokio::sync::RwLock which is much better if many threads try to perform immutable operation on the data which lets to many threads read the data without blocking other read-only threads.

  1. The final step is to implement WriteJob functions and you can use &self instead of &mut self because you modify data with protected with mutex.
impl WriterJob {
    async fn writer_job_impl(&self, rx: Receiver<WriteCommand>) {
       // Job implementation details. It's important that `self` is mutuable
       // since I update the job status
       // eg.
       // *self.inner.lock().await.status = Status::Ready;
    }

    pub(crate) async fn status(&self) -> Status {
       // make sure the `status` implements copy trait
       *self.inner.lock().await.status
    }

    pub(crate) fn spawn_writer_job(&self) -> Sender<WriteCommand> {
        let (tx, rx) = mpsc::channel(10);
        // creates a new instance `Self` but all the instances of `self.inner`
        // references to the same shared state because it's wrapped by the
        // shader pointer `Arc`
        let writer = self.clone();

        let handle = tokio::spawn(async move {
            writer.writer_job_impl(rx).await;
        });

        *self.inner.status.lock().await = Status::Spawned;

        tx
    }
}
0

One possible solution to the problem is to use interior mutability. By changing the Status field to a Cell<Status> - or an atomic type, crossbeam's crossbeam::atomic::AtomicCell comes to mind - you can remove the mut and lifetime restrictions.

Let's say you move your status field into an inner type; I called it WriterJobInner. The WriterJob now only owns an Arc<WriterJobInner>:

pub(crate) struct WriterJob {
    inner: std::sync::Arc<WriterJobInner>,
}

struct WriterJobInner {
    pub status: std::cell::Cell<Status>,
}

struct WriteCommand;

#[derive(Copy, Clone)]
enum Status {
    None,
    Spawned,
    Ready,
}

If you want, you can implement Deref for WriterJob to simplify some of the accesses.

Your WriterJob implementation would now change to this:

impl WriterJob {
    pub(crate) fn spawn_writer_job(&self) -> tokio::sync::mpsc::Sender<WriteCommand> {
        let (tx, rx) = tokio::sync::mpsc::channel(10);

        // Clone the Arc and move it to the background thread.
        let inner = self.inner.clone();
        let handle = tokio::spawn(async move {
            inner.writer_job_impl(rx).await;
        });

        // TODO: Set only to Spawned if still None
        self.inner.status.set(Status::Spawned);
        tx
    }

    pub(crate) fn status(&self) -> Status {
        self.inner.status()
    }
}

Since the thread doesn't need to pass self - neither mutably nor immutably - the error goes away. In addition, since status is now a Cell, it also does not require mut self.

Likewise, your WriterJobInner also only takes &self:

impl WriterJobInner {
    pub async fn writer_job_impl(&self, rx: tokio::sync::mpsc::Receiver<WriteCommand>) {
        // ...
        self.status.set(Status::Ready);
    }

    pub fn status(&self) -> Status {
        self.status.get()
    }
}

unsafe impl Send for WriterJobInner {}
unsafe impl Sync for WriterJobInner {}

On a downside, the WriterJobInner type is required to be both Send and Sync in order to be used with Arc<T>, but you're using it cross-thread anyway.

Note that setting the status to Spawned after the thread was created is a race condition. You way want to try to atomically set the value only if it wasn't already set to something else.


In this approach, the most useful mix is likely one where you bundle everything together that needs to be changed together and then use exterior mutability for it.

Using RwLock (or similar) above here, e.g. as inner: Arc<RwLock<..>>, requires that you either mix sync with async code (std::sync::RwLock in an async method) or make your accessors async (when using tokio::sync::RwLock).

xc wang
  • 318
  • 1
  • 3
  • 14
sunside
  • 8,069
  • 9
  • 51
  • 74