Strange behaviour when using random.shuffle on numpy.array

Question

import numpy as np
from random import shuffle

a = np.array([[1,2,3],[4,5,6]])
print(a)
for i in range(1000):
    shuffle(a)
print(a)

Output:

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [1 2 3]]

I get two copies of [1, 2, 3], and [4, 5, 6] is missing.

import numpy as np
from random import shuffle

a = np.arange(10).reshape(5,2)
print(a)
for i in range(1000):
    shuffle(a)
print(a)

Output:

[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
[[0 1]
 [0 1]
 [0 1]
 [0 1]
 [0 1]]

Every row of the array becomes a copy of the first row.

score 3 · Answer 1 · edited May 23 '17 at 11:44

numpy.random.shuffle(a) should do what you want. From playing around with random.shuffle in the terminal, it doesn't seem to handle multi-dimensional arrays very well.

Related question: Shuffle a numpy array

edited May 23 '17 at 11:44 by Community
answered Sep 18 '15 at 04:54 by user1373945

score 3 · Accepted Answer · answered Sep 18 '15 at 04:55

The basic issue occurs because random.shuffle swaps elements with the following statement (it can be found in the source code of the random module):

x[i], x[j] = x[j], x[i]

If you do this kind of assignment on a NumPy array (as in your case), you get the issue:

In [41]: ll
Out[41]:
array([[7, 8],
       [5, 6],
       [1, 2],
       [3, 4]])

In [42]: ll[0] , ll[1] = ll[1] , ll[0]

In [43]: ll
Out[43]:
array([[5, 6],
       [5, 6],
       [1, 2],
       [3, 4]])
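
For comparison, here is a minimal sketch of a row swap that does work on a NumPy array. It relies on the fact that indexing with a list (fancy indexing) returns a copy rather than a view, so the right-hand side is fully evaluated into fresh memory before any row is overwritten. The name ll follows the transcript above; this is an illustration, not what random.shuffle does internally.

```python
import numpy as np

ll = np.array([[7, 8], [5, 6], [1, 2], [3, 4]])

# Indexing with a list returns a copy, so the right-hand side is
# materialised before any row of ll is overwritten.
ll[[0, 1]] = ll[[1, 0]]

print(ll)
```

With this form, rows 0 and 1 are exchanged and no row is duplicated.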

The following example may help to show why the issue occurs:

In [63]: ll = np.array([[1,2],[3,4],[5,6],[7,8]])

In [64]: ll[0]
Out[64]: array([1, 2])

In [65]: x = ll[0]

In [66]: x
Out[66]: array([1, 2])

In [67]: y = ll[1]

In [68]: y
Out[68]: array([3, 4])

In [69]: ll[1] = x

In [70]: y
Out[70]: array([1, 2])

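As you can see, when ll[1] is set to a new value, the variable y reflects the change as well. This is because indexing a row of a NumPy array returns a view onto the array's data, and assigning to ll[1] copies the values of x into that row in place (please note, I am not talking about ll as a whole, but about ll[1], the inner ndarray of ll) instead of rebinding ll[1] to the object referenced by x, as happens with lists. In the swap x[i], x[j] = x[j], x[i], the right-hand side therefore holds two views; once x[i] has been overwritten, the view of the old x[i] already shows the new values, so x[j] receives the same data.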
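
The difference between the two container types can be sketched side by side. The names arr, lst, row0 and shares below are made up for this illustration; np.shares_memory is used only to confirm that a row taken from the array is a view.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
lst = [[1, 2], [3, 4]]

# A row taken from an ndarray is a view that shares memory with the array.
row0 = arr[0]
shares = np.shares_memory(row0, arr)

# Tuple swap on the ndarray: the right-hand side holds two views.
# Writing arr[0] first overwrites the data that the view of the old
# arr[0] points to, so the second assignment copies the new values again.
arr[0], arr[1] = arr[1], arr[0]

# The same swap on a list only rebinds references, so it works.
lst[0], lst[1] = lst[1], lst[0]

print(shares)
print(arr)
print(lst)
```

The array ends up with two identical rows, while the list is swapped as expected.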

As a solution, you can use np.random.shuffle() instead of random.shuffle():

In [71]: ll = np.array([[1,2],[3,4],[5,6],[7,8]])

In [72]: ll
Out[72]:
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [73]: from numpy.random import shuffle

In [74]: shuffle(ll)

In [75]: ll
Out[75]:
array([[7, 8],
       [3, 4],
       [1, 2],
       [5, 6]])

Please note that np.random.shuffle() only shuffles elements along the first index of a multi-dimensional array, so rows are reordered but the contents of each row stay the same. (Though if random.shuffle() had worked, it would have behaved the same way.)
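
A short sketch of what "along the first index" means, using the array from the question. The second half is an assumption about what one might want instead: it shuffles every element by shuffling a flattened view, which only works in place when reshape(-1) returns a view (true for a C-contiguous array such as this one). The names original_rows, shuffled_rows and flat are made up for this illustration.

```python
import numpy as np

a = np.arange(10).reshape(5, 2)
original_rows = {tuple(row) for row in a.tolist()}

# Shuffles the rows in place; the contents of each row stay together.
np.random.shuffle(a)
shuffled_rows = {tuple(row) for row in a.tolist()}
print(a)
print(shuffled_rows == original_rows)  # same five rows, possibly reordered

# Assumption: to mix all elements, shuffle a flattened view instead.
# reshape(-1) returns a view here because a is C-contiguous.
flat = a.reshape(-1)
np.random.shuffle(flat)
print(a)
```

After the first call the set of rows is unchanged; after the second call the individual values may have moved between rows.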
