I am currently working on optimizing the Rust jpeg decoder crate using SIMD. To avoid long repetitions in the code, I would like to write a macro that generates the following matrix transposition code:
s = [
    i32x8::new(s[0].extract(0), s[1].extract(0), s[2].extract(0), s[3].extract(0), s[4].extract(0), s[5].extract(0), s[6].extract(0), s[7].extract(0)),
    i32x8::new(s[0].extract(1), s[1].extract(1), s[2].extract(1), s[3].extract(1), s[4].extract(1), s[5].extract(1), s[6].extract(1), s[7].extract(1)),
    i32x8::new(s[0].extract(2), s[1].extract(2), s[2].extract(2), s[3].extract(2), s[4].extract(2), s[5].extract(2), s[6].extract(2), s[7].extract(2)),
    i32x8::new(s[0].extract(3), s[1].extract(3), s[2].extract(3), s[3].extract(3), s[4].extract(3), s[5].extract(3), s[6].extract(3), s[7].extract(3)),
    i32x8::new(s[0].extract(4), s[1].extract(4), s[2].extract(4), s[3].extract(4), s[4].extract(4), s[5].extract(4), s[6].extract(4), s[7].extract(4)),
    i32x8::new(s[0].extract(5), s[1].extract(5), s[2].extract(5), s[3].extract(5), s[4].extract(5), s[5].extract(5), s[6].extract(5), s[7].extract(5)),
    i32x8::new(s[0].extract(6), s[1].extract(6), s[2].extract(6), s[3].extract(6), s[4].extract(6), s[5].extract(6), s[6].extract(6), s[7].extract(6)),
    i32x8::new(s[0].extract(7), s[1].extract(7), s[2].extract(7), s[3].extract(7), s[4].extract(7), s[5].extract(7), s[6].extract(7), s[7].extract(7)),
];
The macro should be able to generate the code for different matrix sizes (4 or 8).
I have tried several different approaches, but I have not managed to get the macro to repeat an n-item pattern n times.
The approach that seems most logical to me is:
macro_rules! square {
    (($($x:tt),*), ($($y:tt),*)) => {
        [
            $([
                $( s[$x].extract($y) ),*
            ]),*
        ]
    };
    ($($x:expr),*) => { square!( ($($x),*) , ($($x),*) ) };
}
but it fails with:
error: attempted to repeat an expression containing no syntax variables matched as repeating at this depth
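As far as I can tell, the error arises because both `$x` and `$y` are matched at repetition depth 1 but used at depth 2 in the expansion, so the inner repetition has nothing that repeats at that depth. One possible workaround, shown here only as a minimal sketch, is to capture one index list as a single `tt` and hand it to a helper macro that expands one row at a time. In the sketch, `V4` is a hypothetical 4-lane stand-in for the real SIMD type (it exists only so the code compiles without a SIMD crate), and `row!`, `transpose4`, the `@rows` marker, and the extra `$ty` and `$s` parameters are my own assumptions, not part of any crate.

```rust
// Hypothetical stand-in for a 4-lane SIMD vector, used only for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
struct V4([i32; 4]);

impl V4 {
    fn new(a: i32, b: i32, c: i32, d: i32) -> Self {
        V4([a, b, c, d])
    }

    fn extract(self, lane: usize) -> i32 {
        self.0[lane]
    }
}

// Helper: builds one output vector by taking lane `$y` from every input vector.
// The index list arrives as one parenthesized token tree and is unpacked here.
macro_rules! row {
    ($ty:ident, $s:ident, $y:tt, ($($x:tt),*)) => {
        $ty::new($( $s[$x].extract($y) ),*)
    };
}

macro_rules! square {
    // Internal rule: `$xs` is the whole index list captured as a single `tt`,
    // so it can be forwarded unchanged inside the repetition over `$y`.
    (@rows $ty:ident, $s:ident, $xs:tt, ($($y:tt),*)) => {
        [ $( row!($ty, $s, $y, $xs) ),* ]
    };
    // Entry point: duplicate the index list, once for rows and once for columns.
    ($ty:ident, $s:ident; $($i:tt),*) => {
        square!(@rows $ty, $s, ($($i),*), ($($i),*))
    };
}

// Transposes a 4x4 matrix stored as four 4-lane vectors.
fn transpose4(s: [V4; 4]) -> [V4; 4] {
    square!(V4, s; 0, 1, 2, 3)
}
```

With a real 8-lane type whose `new` takes eight arguments, the same macro should be usable as `square!(i32x8, s; 0, 1, 2, 3, 4, 5, 6, 7)`, though I have only checked the idea against the stand-in type above. Passing `s` as a parameter rather than naming it inside the macro body also keeps the macro independent of where it is defined.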