I wish to create an IndexedParallelIterator<Item = [I::Item; N]> from an array [I; N] where I: IndexedParallelIterator. Let's call it ParConstZip<I, N>.

As I understand it, there are a few steps here:

  1. Create the corresponding non-parallel ConstZip<I, N> iterator (which must implement ExactSizeIterator and DoubleEndedIterator).
  2. Create a Producer to split the input and create the ConstZip<I, N> iterators.
  3. Implement ParallelIterator and IndexedParallelIterator for ParConstZip<I, N> using the Producer.

I think I'm alright on points 1 and 2. Here's the ConstZip<I, N> implementation:

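// Nightly only at the time of writing:
// #![feature(maybe_uninit_uninit_array, maybe_uninit_array_assume_init)]
use std::mem::MaybeUninit;

pub struct ConstZip<I, const N: usize>([I; N]);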

impl<I, const N: usize> Iterator for ConstZip<I, N>
where
    I: Iterator,
{
    type Item = [I::Item; N];

    fn next(&mut self) -> Option<Self::Item> {
        let mut dst = MaybeUninit::uninit_array();

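        // Note: if a later iterator is exhausted, the `?` below returns early and
        // the items already written to `dst` are leaked (never dropped).
        for (i, iter) in self.0.iter_mut().enumerate() {
            dst[i] = MaybeUninit::new(iter.next()?);
        }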

        // SAFETY: If we reach this point, `dst` has been fully initialized
        unsafe { Some(MaybeUninit::array_assume_init(dst)) }
    }
}

impl<I, const N: usize> ExactSizeIterator for ConstZip<I, N>
where
    I: ExactSizeIterator,
{
    fn len(&self) -> usize {
        self.0.iter().map(|x| x.len()).min().unwrap()
    }
}

impl<I, const N: usize> DoubleEndedIterator for ConstZip<I, N>
where
    I: DoubleEndedIterator,
{
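    // Note: this only lines up with `next` if all inner iterators have the
    // same length; otherwise the longer ones need trimming from the back first.
    fn next_back(&mut self) -> Option<Self::Item> {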
        let mut dst = MaybeUninit::uninit_array();

        for (i, iter) in self.0.iter_mut().enumerate() {
            dst[i] = MaybeUninit::new(iter.next_back()?);
        }

        // SAFETY: If we reach this point, `dst` has been fully initialized
        unsafe { Some(MaybeUninit::array_assume_init(dst)) }
    }
}
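
A stable, safe variant of the iterator above might look like the following minimal sketch. It assumes that allocating a temporary `Vec` per item is acceptable, derives `len()` from `size_hint()`, and additionally requires `ExactSizeIterator` for `next_back` so that longer inner iterators can be trimmed first (an assumption that is not part of the code above). The field is made `pub` only to keep the sketch easy to try out.

```rust
use std::convert::TryInto; // already in the prelude since edition 2021

/// Zips `N` iterators of the same type into an iterator over arrays.
pub struct ConstZip<I, const N: usize>(pub [I; N]);

impl<I: Iterator, const N: usize> Iterator for ConstZip<I, N> {
    type Item = [I::Item; N];

    fn next(&mut self) -> Option<Self::Item> {
        // Stops at the first exhausted iterator; items already pulled are dropped.
        let items: Vec<I::Item> =
            self.0.iter_mut().map(|it| it.next()).collect::<Option<_>>()?;
        // The `Vec` holds exactly `N` elements here, so the conversion succeeds.
        items.try_into().ok()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        // With N == 0 the iterator never ends, hence `usize::MAX` and `None`.
        let lower = self.0.iter().map(|it| it.size_hint().0).min().unwrap_or(usize::MAX);
        let upper = self.0.iter().filter_map(|it| it.size_hint().1).min();
        (lower, upper)
    }
}

// The default `len()` is derived from `size_hint()`.
impl<I: ExactSizeIterator, const N: usize> ExactSizeIterator for ConstZip<I, N> {}

impl<I, const N: usize> DoubleEndedIterator for ConstZip<I, N>
where
    I: DoubleEndedIterator + ExactSizeIterator,
{
    fn next_back(&mut self) -> Option<Self::Item> {
        // Trim longer iterators so that every back item shares the same index.
        let len = self.0.iter().map(|it| it.len()).min().unwrap_or(0);
        for it in self.0.iter_mut() {
            for _ in len..it.len() {
                it.next_back();
            }
        }
        let items: Vec<I::Item> =
            self.0.iter_mut().map(|it| it.next_back()).collect::<Option<_>>()?;
        items.try_into().ok()
    }
}
```

Zipping `vec![1, 2, 3].into_iter()` with `vec![10, 20].into_iter()` this way reports a length of 2, and `next_back` skips the surplus `3` before pairing `2` with `20`.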

And here's what I believe to be an appropriate producer:

use rayon::iter::plumbing::Producer;

pub struct ParConstZipProducer<P, const N: usize>([P; N]);

impl<P, const N: usize> Producer for ParConstZipProducer<P, N>
where
    P: Producer,
{
    type Item = [P::Item; N];
    type IntoIter = ConstZip<P::IntoIter, N>;

    fn into_iter(self) -> Self::IntoIter {
        ConstZip(self.0.map(Producer::into_iter))
    }

    fn split_at(self, index: usize) -> (Self, Self) {
        let mut left_array = MaybeUninit::uninit_array();
        let mut right_array = MaybeUninit::uninit_array();

        for (i, producer) in self.0.into_iter().enumerate() {
            let (left, right) = producer.split_at(index);
            left_array[i] = MaybeUninit::new(left);
            right_array[i] = MaybeUninit::new(right);
        }

        // SAFETY: Arrays are guaranteed to be fully initialised at length `N`
        let left_array = unsafe { MaybeUninit::array_assume_init(left_array) };
        let right_array = unsafe { MaybeUninit::array_assume_init(right_array) };

        (
            ParConstZipProducer(left_array),
            ParConstZipProducer(right_array),
        )
    }
}
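
The splitting step in `split_at` can also be written without `MaybeUninit`. The sketch below is standard-library only: `unzip_array` and `split_all` are hypothetical helpers, and plain slices stand in for rayon producers, so it illustrates only the array handling and not the `Producer` trait itself.

```rust
use std::convert::TryInto; // already in the prelude since edition 2021

/// Hypothetical helper: turns an array of pairs into a pair of arrays.
fn unzip_array<A, B, const N: usize>(pairs: [(A, B); N]) -> ([A; N], [B; N]) {
    // `IntoIterator::into_iter` iterates the array by value on every edition.
    let (left, right): (Vec<A>, Vec<B>) = IntoIterator::into_iter(pairs).unzip();
    // Both vectors hold exactly `N` elements, so neither conversion can fail.
    let left: [A; N] = left.try_into().unwrap_or_else(|_| unreachable!());
    let right: [B; N] = right.try_into().unwrap_or_else(|_| unreachable!());
    (left, right)
}

/// Stand-in for `Producer::split_at`, using slices instead of producers:
/// every slice is split at the same index.
fn split_all<'a, T, const N: usize>(
    parts: [&'a [T]; N],
    index: usize,
) -> ([&'a [T]; N], [&'a [T]; N]) {
    unzip_array(parts.map(|part| part.split_at(index)))
}
```

With real producers, the body of `split_at` would then reduce to mapping `producer.split_at(index)` over the array and passing the result through a helper like `unzip_array`, at the cost of two temporary `Vec` allocations per split.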

However, I'm stuck on the actual implementation of IndexedParallelIterator. Most of it seems to be boilerplate, but I cannot figure out how to implement the with_producer method:

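use rayon::iter::plumbing::{bridge, Consumer, ProducerCallback, UnindexedConsumer};
use rayon::iter::{IndexedParallelIterator, ParallelIterator};

pub struct ParConstZip<I, const N: usize>([I; N]);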

impl<I, const N: usize> ParallelIterator for ParConstZip<I, N>
where
    I: IndexedParallelIterator,
{
    type Item = [I::Item; N];

    fn drive_unindexed<C>(self, consumer: C) -> C::Result
    where
        C: UnindexedConsumer<Self::Item>,
    {
        bridge(self, consumer)
    }
}

impl<I, const N: usize> IndexedParallelIterator for ParConstZip<I, N>
where
    I: IndexedParallelIterator,
{
    fn drive<C>(self, consumer: C) -> C::Result
    where
        C: Consumer<Self::Item>,
    {
        bridge(self, consumer)
    }

    fn len(&self) -> usize {
        self.0.iter().map(|x| x.len()).min().unwrap()
    }

    fn with_producer<CB>(self, callback: CB) -> CB::Output
    where
        CB: ProducerCallback<Self::Item>,
    {
        todo!()
    }
}
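
The nesting that `with_producer` calls for can be illustrated with a toy model that uses only the standard library. This is a sketch, not rayon's API: `IterCallback`, `WithIter`, `Pair`, and `Collect` are hypothetical names, and a plain `Iterator` stands in for rayon's `Producer`. It covers the two-source case, where each callback struct carries the state that a closure would otherwise capture.

```rust
/// Toy stand-in for `ProducerCallback`: a "generic closure" that accepts
/// any iterator type yielding `T`.
trait IterCallback<T> {
    type Output;
    fn callback<P: Iterator<Item = T>>(self, iter: P) -> Self::Output;
}

/// Toy stand-in for the `with_producer` half of `IndexedParallelIterator`.
trait WithIter {
    type Item;
    fn with_iter<CB: IterCallback<Self::Item>>(self, callback: CB) -> CB::Output;
}

impl<T> WithIter for Vec<T> {
    type Item = T;
    fn with_iter<CB: IterCallback<T>>(mut self, callback: CB) -> CB::Output {
        // The iterator borrows a local, so it can only be handed to a callback.
        callback.callback(self.drain(..))
    }
}

/// Zips two sources by nesting one callback inside the other.
struct Pair<A, B>(A, B);

impl<A: WithIter, B: WithIter> WithIter for Pair<A, B> {
    type Item = (A::Item, B::Item);

    fn with_iter<CB: IterCallback<Self::Item>>(self, callback: CB) -> CB::Output {
        return self.0.with_iter(CallbackA { callback, b: self.1 });

        // Step 1: receives A's iterator, then asks B for its iterator.
        struct CallbackA<CB, B> { callback: CB, b: B }
        impl<CB, AItem, B> IterCallback<AItem> for CallbackA<CB, B>
        where
            B: WithIter,
            CB: IterCallback<(AItem, B::Item)>,
        {
            type Output = CB::Output;
            fn callback<PA: Iterator<Item = AItem>>(self, a_iter: PA) -> CB::Output {
                self.b.with_iter(CallbackB { callback: self.callback, a_iter })
            }
        }

        // Step 2: receives B's iterator and hands the zipped pair onwards.
        struct CallbackB<CB, PA> { callback: CB, a_iter: PA }
        impl<CB, PA, BItem> IterCallback<BItem> for CallbackB<CB, PA>
        where
            PA: Iterator,
            CB: IterCallback<(PA::Item, BItem)>,
        {
            type Output = CB::Output;
            fn callback<PB: Iterator<Item = BItem>>(self, b_iter: PB) -> CB::Output {
                self.callback.callback(self.a_iter.zip(b_iter))
            }
        }
    }
}

/// A final callback that simply collects whatever it is given.
struct Collect;
impl<T> IterCallback<T> for Collect {
    type Output = Vec<T>;
    fn callback<P: Iterator<Item = T>>(self, iter: P) -> Vec<T> {
        iter.collect()
    }
}
```

Calling `Pair(vec![1, 2, 3], vec!['a', 'b']).with_iter(Collect)` walks through both callbacks and collects the zipped pairs. Extending the pattern to `[I; N]` appears less direct, because each nested callback receives its own generic iterator type, so the results cannot simply be gathered into a single array type from inside the callbacks.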

Playground link.

I needed the same thing about a year ago and asked a question on the users.rust-lang.org forum. One of the authors of rayon gave me some helpful high-level pointers, but after reading the rayon plumbing README several times, both then and now, I have to admit I'm still a bit lost.

MSR
  • DO NOT override `ExactSizeIterator::len()` like this. The size should be calculated and returned in `Iterator::size_hint()`. You can override `len()` too, if you can provide a more performant implementation (that makes sense here, since you're in a generic context), but **always** implement `Iterator::size_hint()` too. – Chayim Friedman Jun 12 '22 at 23:46
  • Also, just in case, you can replace the unstable `MaybeUninit` methods with their implementation to stay on stable land. – Chayim Friedman Jun 12 '22 at 23:47
  • Thanks for the tips. To be clear, here's another [playground link](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f7275fb7b453cfb10418bdc0c4bba7b8) with a better `size_hint`/`len`, and with all unsafe code removed. I understand that this is not performant, but I'm mainly interested in the `rayon` specifics. – MSR Jun 13 '22 at 10:27
  • You may be overthinking the implementation of `IndexedParallelIterator`. It really ought to be boiler-plate; it's just working around a lifetime limitation with closure syntax. You should be able to copy that and just add fields to the struct for things that the closure will need. – Peter Hall Jun 13 '22 at 13:18
  • Thanks, that's also what I gather from the plumbing README, yet I still get lost in the weeds somewhere with all the traits and bounds. If you fancy fleshing it out for an answer a little bit, that would be very appreciated. – MSR Jun 13 '22 at 13:20
