Efficiently apply different permutations for each row of a 2D NumPy array

Question

Given a matrix A, I want to apply a different random shuffle to each row of A; for example,

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

becomes

array([[1, 3, 2],
       [6, 5, 4],
       [7, 9, 8]])

Of course we can loop over the matrix and shuffle each row in turn; however, iteration is slow, and I am asking whether there is a more efficient way to do this.

There is another answer [here](https://stackoverflow.com/questions/21010947/fast-column-shuffle-of-each-row-numpy). The comments there also suggest `apply_along_axis`. Other answers, for columns, are [here](https://stackoverflow.com/questions/26975807/efficient-way-to-shuffle-one-column-at-the-time-in-numpy-matrix), [here](https://stackoverflow.com/questions/20546419/shuffle-columns-of-an-array-with-numpy) and [here](https://stackoverflow.com/questions/36272992/numpy-random-shuffle-by-row-independently) — Sheldore, Jun 10 '19 at 23:11
And one more [here for column as well](https://stackoverflow.com/questions/35646908/numpy-shuffle-multidimensional-array-by-row-only-keep-column-order-unchanged) — Sheldore, Jun 10 '19 at 23:15

cs95 · Accepted Answer · 2019-06-10T23:03:52.703

Picked up this neat trick from Divakar which involves randn and argsort:

import numpy as np

np.random.seed(0)

s = np.arange(16).reshape(4, 4)
np.take_along_axis(s, np.random.randn(*s.shape).argsort(axis=1), axis=1)

array([[ 1,  0,  3,  2],
       [ 4,  6,  5,  7],
       [11, 10,  8,  9],
       [14, 12, 13, 15]])

For a 2D array, the same thing can be written with plain fancy indexing:

s[np.arange(len(s))[:,None], np.random.randn(*s.shape).argsort(axis=1)]

array([[ 1,  0,  3,  2],
       [ 4,  6,  5,  7],
       [11, 10,  8,  9],
       [14, 12, 13, 15]])
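
A minimal sketch of the same idea wrapped in a function, in case it helps to see why it works: sorting one random key per element along each row gives an independent random ordering of the column indices for every row. The name `shuffle_rows` is hypothetical, and drawing uniform keys from a `Generator` instead of calling `randn` is a substitution of mine, not part of the original trick.

```python
import numpy as np

def shuffle_rows(a, rng):
    """Return a copy of 2D array `a` with each row shuffled independently."""
    # One random key per element; argsort along axis 1 turns each row of
    # keys into a random permutation of that row's column indices.
    keys = rng.random(a.shape)
    idx = keys.argsort(axis=1)
    return np.take_along_axis(a, idx, axis=1)

rng = np.random.default_rng(0)
s = np.arange(16).reshape(4, 4)
out = shuffle_rows(s, rng)

# Each row holds the same values as before, only reordered
# (the rows of `s` are already sorted, so sorting `out` recovers `s`).
assert (np.sort(out, axis=1) == s).all()
```

The input array is not modified; the shuffled data comes back as a new array.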

You can also apply np.random.permutation over each row independently to return a new array.

np.apply_along_axis(np.random.permutation, axis=1, arr=s)

array([[ 3,  1,  0,  2],
       [ 4,  6,  5,  7],
       [ 8,  9, 10, 11],
       [15, 14, 13, 12]])
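
For what it's worth, newer NumPy releases (1.20 and later, as far as I know) also offer `Generator.permuted`, which shuffles along an axis independently for each row and returns a new array. This is not part of the original answer; the sketch below assumes such a NumPy version is installed.

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.arange(16).reshape(4, 4)

# permuted shuffles along axis 1 separately for every row and
# returns a new array; `s` itself is left unchanged.
shuffled = rng.permuted(s, axis=1)

# Same values per row, possibly in a different order.
assert (np.sort(shuffled, axis=1) == s).all()
```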

Performance:

s = np.arange(10000 * 100).reshape(10000, 100) 

%timeit s[np.arange(len(s))[:,None], np.random.randn(*s.shape).argsort(axis=1)] 
%timeit np.apply_along_axis(np.random.permutation, 1, s)   

84.6 ms ± 857 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
842 ms ± 8.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I've noticed the relative performance depends on the dimensions of your data, so make sure to test it on your own shapes first.
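
Outside IPython, one way to run that kind of comparison on your own shapes is the standard `timeit` module. This is only a rough sketch: the helper name `time_methods` and the example shapes are made up for illustration, and the numbers will vary by machine.

```python
import timeit
import numpy as np

def time_methods(n_rows, n_cols, repeats=3):
    """Return the best-of-`repeats` wall time in seconds for each approach."""
    s = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)
    rows = np.arange(n_rows)[:, None]
    candidates = {
        # argsort trick from the answer above
        "argsort": lambda: s[rows, np.random.randn(*s.shape).argsort(axis=1)],
        # one np.random.permutation call per row
        "apply_along_axis": lambda: np.apply_along_axis(np.random.permutation, 1, s),
    }
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeats))
        for name, fn in candidates.items()
    }

# Compare a tall array against a wide one.
for shape in [(1000, 10), (10, 1000)]:
    print(shape, time_methods(*shape))
```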

Thanks! So if I got a 3D array and if I want to permute the last dimension, then I can do `np.take_along_axis(s, np.random.randn(*s.shape).argsort(axis=2), axis=2)`, right? — Tony, Jun 10 '19 at 23:28
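
A quick check of that 3D call on a small array, as a sketch rather than a proof: every length-4 slice along the last axis should keep its values and only change their order.

```python
import numpy as np

np.random.seed(0)
s = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Independent random keys for every element, sorted along the last axis.
idx = np.random.randn(*s.shape).argsort(axis=2)
out = np.take_along_axis(s, idx, axis=2)

assert out.shape == s.shape
# Slices along the last axis of `s` are already sorted, so sorting
# `out` along that axis should give `s` back.
assert (np.sort(out, axis=2) == s).all()
```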

score 0 · Answer 2 · answered Jun 10 '19 at 22:57

Code-wise, you can use NumPy's apply_along_axis with np.random.shuffle, which shuffles each row of matrix in place:

np.apply_along_axis(np.random.shuffle, 1, matrix)

but it doesn't seem to be more efficient than iterating, at least for a 3x3 matrix. For that method I get

%%timeit
np.apply_along_axis(np.random.shuffle, 1, test)

67 µs ± 1.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

while the iteration gives

%%timeit
for i in range(test.shape[0]):
    np.random.shuffle(test[i])

20.3 µs ± 284 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

`apply_along_axis` essentially just iterates over the 'other' axes. No speed promises. It makes iteration prettier for 3d and larger; does nothing for 2d. — hpaulj, Jun 10 '19 at 23:04
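
To make the in-place behaviour of both variants concrete, here is a small sketch. It assumes `test` is a plain integer `ndarray`; the names `original` and `test2` are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
test = np.arange(9).reshape(3, 3)
original = test.copy()

# test[i] is a view of row i, so shuffling it modifies `test` in place.
for i in range(test.shape[0]):
    rng.shuffle(test[i])

assert (np.sort(test, axis=1) == original).all()

# apply_along_axis passes row views to the function as well, so
# np.random.shuffle also acts in place here. shuffle returns None,
# so the return value should not be treated as the shuffled array.
test2 = original.copy()
np.apply_along_axis(np.random.shuffle, 1, test2)
assert (np.sort(test2, axis=1) == original).all()
```

Either way the work is one Python-level call per row, which is consistent with the timings above.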
