1

I have some multithreaded code that involves using a nested loop, where the inner one is run in parallel. "Shared" across each thread is a Sender that will return results. Notably Sender implements Send so there should be no problem cloning it and sending it using Rayon's for_each_with(). However, compiling this code:

use std::sync::mpsc::channel;
use rayon::prelude::*;

fn main(){
    let (sender, receiver) = channel();

    (0..5).for_each(|i|{
        (0..5).into_par_iter().for_each_with(&sender, |sender, j|{
            sender.send(i + j).unwrap();   
        });
    });
}

Gives me:

8 |         (0..5).into_par_iter().for_each_with(&sender, |sender, j|{
  |                                              ^^^^^^^ `Sender<_>` cannot be shared between threads safely
  |
  = help: the trait `Sync` is not implemented for `Sender<_>`
  = note: required because of the requirements on the impl of `Send` for `&Sender<_>`

(Playground)

Now I was thinking this might be because I'm trying to clone a reference, but if I move the actual sender into for_each_with() (ie for_each_with(sender, ...)), it will be consumed by the first iteration of the outer loop:

error[E0507]: cannot move out of `sender`, a captured variable in an `FnMut` closure

(Playground).

How can I implement this pattern in a way that satisfies the Rust compiler?

Migwell
  • 18,631
  • 21
  • 91
  • 160
  • just clone https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=93126213bb401a5c204ac916bf8fb516 – Stargateur Aug 28 '21 at 04:57
  • 1
    style warning, `for_each` is bad and make no sense, use for loop, the world thanks you. – Stargateur Aug 28 '21 at 05:00
  • I disagree, I find `for_each` much more readable as part of a long chain of iterator methods. And anyway I'm using `try_for_each` and `for_each_with` which add extra functionality not suited to a `for` loop. – Migwell Aug 28 '21 at 05:16
  • It's good that cloning fixes this, but I want to know a) why I need to do so, and b) doesn't that mean I'm cloning `n x m + m` times, n for the inner loop and m for the outer loop? Because `for_each_with` is supposed to clone the `init` argument meaning we have `n x m` clones, but calling clone directly adds another `m` clones. That seems wasteful – Migwell Aug 28 '21 at 05:18
  • I can't answer a question about sense when code doesn't make sense, yes the clone could be avoid if `for_each_with` return the init item. But it doesn't probably because your first loop doesn't make sense in read world case. I didn't use rayon and I don't like it much. also clone a sender is cheap. "And anyway I'm using try_for_each and for_each_with which add extra functionality not suited to a for loop." my advice was only for `for_each` from std. "why I need to do so" if you send `&sender` you ask to rayon to clone a reference and as compiler say it's not safe to share across thread. – Stargateur Aug 28 '21 at 05:41
  • was thinking about this issue, one solution could be to use `ToOwned` in rayon instead of `Clone` – Stargateur Aug 29 '21 at 04:14
  • It looks like `to_owned()` calls `clone()` behind the scenes. – Migwell Aug 29 '21 at 04:39
  • no nevermind you miss understand but anyway to_owned would face same problem cause a reference need sync to implement send and sender from std doesn't implement sync. – Stargateur Aug 29 '21 at 05:30

2 Answers2

2

AFAIK, Rayon use a threadpool, this mean that rayon require the item to implement Send cause it will send it to a thread. rayon doesn't clone for each item but for each thread:

fn for_each_with<OP, T>(self, init: T, op: OP) where
    OP: Fn(&mut T, Self::Item) + Sync + Send,
    T: Send + Clone,

We can see the OP take a &mut T not a T. This mean for_each_with() clone for each number of thread used not for the number of item produce.

Reference need to implement Sync to implement Send. Sender is define to not implement Sync. I don't know the detail of this choice anyway this mean &Sender can't be share between thread. I don't think there is solution to remove this constrain.

But if you want you could use crossbeam-channel that implement Sync:

use crossbeam_channel::unbounded; // 0.5.1
use rayon::prelude::*; // 1.5.1

fn main() {
    let (sender, _receiver) = unbounded();

    for i in 0..5 {
        (0..5).into_par_iter().for_each_with(&sender, |sender, j| {
            sender.send(i + j).unwrap();
        });
    }
}

would compile fine. Bonus is that crossbeam-channel claim to be faster. That said, cloning is totally fine for the std Sender:

use rayon::prelude::*;
use std::sync::mpsc::channel;

fn main() {
    let (sender, _receiver) = channel();

    for i in 0..5 {
        let sender = sender.clone();
        (0..5).into_par_iter().for_each_with(sender, |sender, j| {
            sender.send(i + j).unwrap();
        });
    }
}

That would indeed clone O(n) time more but Sender from std is mean to be clone a lot. (Actually it's probably just add one clone cause you choice to make a nested loop, the code probably doesn't clone the last one but just give it the last thread and so you just clone one too much as we can test)

Anyway, all your problems come from a strange situation, one should flatten the iteration like:

use rayon::prelude::*;
use std::sync::mpsc::channel;

fn main() {
    let (sender, _recever) = channel();

    (0..5)
        .into_par_iter()
        .flat_map_iter(|i| (0..5).map(move |j| (i, j)))
        .for_each_with(sender, |sender, (i, j)| {
            sender.send(i + j).unwrap();
        });
}

You could also consider not using Sender rayon is probably mean to be use with collect() and instead of sending the item you could just collect them at the end.

Stargateur
  • 24,473
  • 8
  • 65
  • 91
  • Lots of great explanations and solutions here, thanks. So if Rayon only clones once per thread, does that mean that the use of `.clone()` will only end up with `n + t` clones, where `n` is the number of iterations in the outer loop, and `t` is the number of threads? Because that's completely tolerable. – Migwell Aug 29 '21 at 07:38
  • @Migwell no it's `n * t` (probably) cause you do n iteration where rayon clone (t - 1) time but you clone yourself 1 time so `n * (t - 1 + 1)`, since you didn't really use the original sender, you clone one too much. Again AFAIK maybe rayon doesn't do this exactly. – Stargateur Aug 29 '21 at 08:41
  • @Migwell Actually we can test it a little https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=44eb3ffcdee07bd47b08735fee5bb615 – Stargateur Aug 29 '21 at 08:59
-2

Sender is made to be cloned, its literally in the docs. Each child thread has to have its own sender.

  • My problem is slightly more subtle than that. I realise that the sender needs to be cloned and am happy to do this, but `for_each_with` *does the cloning for me*, so any additional cloning that I do is needless. – Migwell Aug 29 '21 at 04:38