The basic issue occurs because random.shuffle swaps elements using the following assignment (code can be found here):
x[i], x[j] = x[j], x[i]
If you do this kind of assignment on a NumPy array (as in your case), you get the issue:
In [41]: ll
Out[41]:
array([[7, 8],
       [5, 6],
       [1, 2],
       [3, 4]])
In [42]: ll[0] , ll[1] = ll[1] , ll[0]
In [43]: ll
Out[43]:
array([[5, 6],
       [5, 6],
       [1, 2],
       [3, 4]])
The following example shows why the issue occurs:
In [63]: ll = np.array([[1,2],[3,4],[5,6],[7,8]])
In [64]: ll[0]
Out[64]: array([1, 2])
In [65]: x = ll[0]
In [66]: x
Out[66]: array([1, 2])
In [67]: y = ll[1]
In [68]: y
Out[68]: array([3, 4])
In [69]: ll[1] = x
In [70]: y
Out[70]: array([1, 2])
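As you can see, when ll[1] was set to a new value, y reflected the change as well. This is because indexing a row such as ll[1] returns a view that shares memory with ll, not a copy, and the assignment ll[1] = x copies the values of x into that row in place (please note, this mutates the inner row of ll, not the name ll itself). It does not rebind ll[1] to the object referenced by x, as would happen with a list. In the swap x[i], x[j] = x[j], x[i], the right-hand side therefore holds two views; once the first assignment overwrites row i, the view of row i already contains the new values, so both rows end up identical.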
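To make the view behaviour concrete, here is a minimal sketch (the names row0, naive and safe are only illustrative). It checks that a row shares memory with the parent array, reproduces the broken tuple swap, and shows one way to swap rows safely with fancy indexing, which copies the right-hand side before assigning:

```python
import numpy as np

ll = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Basic indexing returns a view: the row shares memory with ll.
row0 = ll[0]
shares = np.shares_memory(row0, ll)

# Naive tuple swap: the right-hand side holds two views, not copies.
naive = ll.copy()
naive[0], naive[1] = naive[1], naive[0]
# Row 0 is overwritten with [3, 4] first; the view of row 0 then
# already reads [3, 4], so row 1 is assigned its own values again.

# Safe swap: indexing with a list creates a copy of the selected rows,
# so the values are read out before anything is overwritten.
safe = ll.copy()
safe[[0, 1]] = safe[[1, 0]]

print(shares)
print(naive[:2])
print(safe[:2])
```

If you need a single row to survive later assignments, taking ll[0].copy() instead of ll[0] should give the same protection.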
As a solution, you can use np.random.shuffle() instead of random.shuffle():
In [71]: ll = np.array([[1,2],[3,4],[5,6],[7,8]])
In [72]: ll
Out[72]:
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
In [73]: from numpy.random import shuffle
In [74]: shuffle(ll)
In [75]: ll
Out[75]:
array([[7, 8],
       [3, 4],
       [1, 2],
       [5, 6]])
Please note that np.random.shuffle() only shuffles elements along the first axis of a multi-dimensional array. (If random.shuffle() had worked, it would have behaved the same way.)
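If you would rather keep the original array untouched, an alternative sketch is to shuffle the row indices and index with them. This assumes a NumPy version that provides np.random.default_rng (1.17 or later); the seed and the names order and shuffled are only for illustration. The check at the end confirms that whole rows are reordered while each row's contents stay intact:

```python
import numpy as np

ll = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

rng = np.random.default_rng(0)    # seed chosen only for repeatability
order = rng.permutation(len(ll))  # a random ordering of the row indices
shuffled = ll[order]              # indexing with an array returns a new array

# Rows are moved as units: the set of rows is the same before and after.
rows_before = sorted(map(tuple, ll.tolist()))
rows_after = sorted(map(tuple, shuffled.tolist()))
print(rows_before == rows_after)
```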