
I am trying to turn the code below into a parallel iterator to speed up performance:

// something like this
string.split(" ").enumerate().into_par_iter().for_each(|(_, b)| {
    // do something
});

But Rayon doesn't support .into_par_iter() for the Enumerate struct, and being relatively new to Rust, I'm not sure how to fix this. Most similar questions involve vectors, but mine doesn't, as I am trying to do the following:

  1. Get a string: String::from("Lorem ipsum dolor sit amet")
  2. Using .split(" ").enumerate(), turn it into a vector: vec!["Lorem", "ipsum", "dolor", "sit", "amet"]
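
Strictly speaking, .split(" ").enumerate() yields a lazy iterator of (index, &str) pairs rather than a vector; a Vec only exists once the iterator is collected. A minimal sketch of that step, where `indexed_words` is a hypothetical helper name used only for illustration:

```rust
/// Splits `text` on single spaces and pairs each piece with its position.
/// `indexed_words` is an illustrative name, not part of the original code.
fn indexed_words(text: &str) -> Vec<(usize, &str)> {
    text.split(' ')   // lazy iterator over the space-separated pieces
        .enumerate()  // attach a 0-based index to each piece
        .collect()    // only now is a Vec actually built
}
```

Calling `indexed_words("Lorem ipsum dolor sit amet")` gives five pairs, starting with `(0, "Lorem")`.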

So how can I make the code above run in parallel?

Pro Poop
    Rayon has a built-in parallel split, but [it works using divide-and-conquer](https://github.com/rayon-rs/rayon/blob/f45eee8fa49c1646a00f084ca78d362f381f1b65/src/split_producer.rs#L85) to provide better performance, so it doesn't know the index of each item. If you really need the indices, you might have to use `par_bridge()` for this one; otherwise, it would likely be much faster to use `string.par_split(' ')`. – Coder-256 Jan 09 '22 at 00:41

1 Answer


You can use .par_bridge() to convert any iterator into a parallel iterator, as long as the iterator and its items are Send. From the Rayon documentation:

This creates a “bridge” from a sequential iterator to a parallel one, by distributing its items across the Rayon thread pool. This has the advantage of being able to parallelize just about anything, but the resulting ParallelIterator can be less efficient than if you started with par_iter instead. However, it can still be useful for iterators that are difficult to parallelize by other means, like channels or file or network I/O.

The resulting iterator is not guaranteed to keep the order of the original iterator.

use rayon::iter::ParallelBridge;

string.split(" ").enumerate().par_bridge().for_each(|(_, b)| {
    // do something
});
kmdreko