5

I am writing a library which employs a Vec<Vec<T>> type to store data in column-major order (each inner Vec represents a column). The user can create a Vec<Vec<T>> with any row and column length but all columns are constrained to be the same length.

I need to sometimes efficiently iterate through the Vec<Vec<T>> by row. I would like to not change the array type because most of the time I need to iterate "by column vector" (one full column vector at a time).

Unless I am missing something, Iterator::zip is not an option because I do not know the number of column vectors in advance. Itertools::izip and Itertools::multizip are also not viable.

This is my sample code:

let array = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
let mut iterators: Vec<_> = array.iter().map(|x| x.iter()).collect();
for _ in 0..array[0].len() {
    let values: Vec<_> = iterators.iter_mut().map(|x| x.next().unwrap()).collect();
    dbg!(values);
}

Should I define a mutable values vector before starting the iterations to avoid allocations at each cycle, or will the compiler take care of this optimization anyway? What is the easiest method to find it myself?

Are there more efficient / idiomatic solutions?

Shepmaster
  • 388,571
  • 95
  • 1,107
  • 1,366
lucatrv
  • 725
  • 8
  • 14
  • 2
    I think you confuse between row and column, also see https://docs.rs/ndarray/0.12.1/ndarray/struct.ArrayBase.html – Stargateur Feb 18 '19 at 22:34
  • I don't think there is a smart way to do this, if your vector of vector is big, even transform it into a one vector will not improve cache performance, it would be better to change your algo to not need this feature or to inverse your row and column in your data. Without precise real situation is impossible to advice a correct solution. – Stargateur Feb 18 '19 at 22:41
  • 1
    See also the blog post [Memory layout of multi-dimensional arrays](https://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays/) or the Q&A [What is the recommended way to iterate a matrix over rows?](https://stackoverflow.com/q/21778374/155423). – Shepmaster Feb 19 '19 at 00:25
  • Seems like maybe you could do something with `Itertools::kmerge_by`? – trent Feb 19 '19 at 13:17
  • Once I have defined `iterators` as a vector of iterators, is there a smart way to convert it into an iterator of vectors? – lucatrv Feb 19 '19 at 22:20

3 Answers3

4

Once I have a vector of iterators, how can I convert it to an iterator of vectors?

There are two ways to create an Iterator: to use an existing iterator adapter or to implement a custom Iterator.

Let's take the second approach and define a custom Iterator type that takes a vector of iterators:

struct DynamicZip<I>
where I: Iterator {
    iterators: Vec<I>
}

and let's provide an Iterator implementation:

impl<I, T> Iterator for DynamicZip<I>
where I: Iterator<Item = T> {
    type Item = Vec<T>;
    fn next(&mut self) -> Option<Self::Item> {
        let output: Option<Vec<T>> = self.iterators.iter_mut().map(|iter| iter.next()).collect()
        output
    }
}

and we're done!

Going back to the original example

fn main() {
    let array = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
    let iterators: Vec<_> = array.into_iter().map(|v| v.into_iter()).collect();
    let dz = DynamicZip { iterators: iterators };
    // use the Iterator type we just defined
    for column in dz {
        println!("{:?}", column)
    }
}

will produce the output

[1, 4, 7]
[2, 5, 8]
[3, 6, 9]
timotree
  • 1,325
  • 9
  • 29
  • That not more efficient. – Stargateur Feb 20 '19 at 01:24
  • Clarified that with my data model there is nothing better I can do regarding efficiency, I am going to use this solution for my code. IMHO `dynamiczip` could be considered for addition to the `itertools` crate besides `multizip`. – lucatrv Feb 20 '19 at 14:56
0

Your Vec<Vec<T>> is a column oriented matrix where each inner Vec is a column, threfore, it is possible to know the number of columns at run-time by simply performing an array.len() operation.

With both rows and columns, it is easier to create an Iterator. Here is an example:

fn main() {
    let matrix = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];

    let columns = matrix.len();
    let rows = matrix[0].len();
    // If you know the number of rows in advance. E.g.: In some constructor
    // let rows = 3;

    let iter = (0..rows).map(|row_idx| matrix.iter().flatten().skip(row_idx).step_by(columns));

    for (row_idx, row_values) in iter.enumerate() {
        for (column_idx, value) in row_values.enumerate() {
            println!("[{}, {}] = {}", row_idx, column_idx, value);
        }
    }
}
Caio
  • 3,178
  • 6
  • 37
  • 52
0

I have taken timotree's reply and generalized it to work in any circumstance where you might want to transpose a data structure:

pub struct TransposeIter<I, T>
where
    I: IntoIterator<Item = T>,
{
    iterators: Vec<I::IntoIter>,
}

pub trait TransposableIter<I, T>
where
    Self: Sized,
    Self: IntoIterator<Item = I>,
    I: IntoIterator<Item = T>,
{
    fn transpose(self) -> TransposeIter<I, T> {
        let iterators: Vec<_> = self.into_iter().map(|i| i.into_iter()).collect();
        TransposeIter { iterators }
    }
}

impl<I, T> Iterator for TransposeIter<I, T>
where
    I: IntoIterator<Item = T>,
{
    type Item = Vec<T>;
    fn next(&mut self) -> Option<Self::Item> {
        let output: Option<Vec<T>> = self.iterators.iter_mut().map(|iter| iter.next()).collect();
        output
    }
}

impl<I, T, Any> TransposableIter<I, T> for Any
where
    Any: IntoIterator<Item = I>,
    I: IntoIterator<Item = T>,
{
}

Now you can simply import this trait and use it like:

let array = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
assert_eq!(array.transpose().next(), Some(vec![1,4,7]));
jhk999
  • 45
  • 6