How can I shuffle a structured array? numpy.random.shuffle does not seem to work. Further, is it possible to shuffle only a given field, say x, in the following example?

import numpy as np
data = [(1, 2), (3, 4.1), (13, 77), (5, 10), (11, 30)]
dtype = [('x', float), ('y', float)]
data1=np.array(data, dtype=dtype)
data1
>>> array([(1.0, 2.0), (3.0, 4.1), (13.0, 77.0), (5.0, 10.0), (11.0, 30.0)], 
      dtype=[('x', '<f8'), ('y', '<f8')])

np.random.seed(10)
np.random.shuffle(data)
data
>>> [(13, 77), (5, 10), (1, 2), (11, 30), (3, 4.1)]
np.random.shuffle(data1)
data1
>>> array([(1.0, 2.0), (3.0, 4.1), (1.0, 2.0), (3.0, 4.1), (1.0, 2.0)], 
      dtype=[('x', '<f8'), ('y', '<f8')])

I understand that I can explicitly give the randomized index,

data1[np.random.permutation(data1.shape[0])]

but I want in-place shuffling.

imsc

2 Answers

This was due to a NumPy bug (https://github.com/numpy/numpy/issues/4270), which was resolved in NumPy 1.8.1. It now works as expected.

np.random.shuffle(data1)
data1
>>> array([(1.0, 2.0), (13.0, 77.0), (11.0, 30.0), (5.0, 10.0), (3.0, 4.1)], 
      dtype=[('x', '<f8'), ('y', '<f8')])
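
The question also asks about shuffling only one field. A minimal sketch of one way to do that, assuming it is acceptable to build a temporary shuffled copy of the field and write it back into the same array (the name y_before is only there for illustration):

```python
import numpy as np

data = [(1, 2), (3, 4.1), (13, 77), (5, 10), (11, 30)]
dtype = [('x', float), ('y', float)]
data1 = np.array(data, dtype=dtype)

y_before = data1['y'].copy()  # kept only to show that 'y' is untouched

np.random.seed(10)
# permutation() returns a shuffled copy of the 'x' values; assigning it to
# data1['x'] writes into data1's own buffer, so no new structured array is made.
data1['x'] = np.random.permutation(data1['x'])
```

After this, data1['x'] holds the same values in a random order, while data1['y'] keeps its original order, so the (x, y) pairing is deliberately broken.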
imsc

numpy.random.shuffle seems to support multi-dimensional arrays. See the numpy.random.shuffle documentation: its example shows that you can pass a multi-dimensional array as the argument.
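
As a small illustration of that behaviour (a sketch with a made-up 3x3 array, not taken from the documentation verbatim): for a multi-dimensional array, shuffle only reorders along the first axis, so whole rows move but their contents stay intact.

```python
import numpy as np

np.random.seed(10)
arr = np.arange(9).reshape((3, 3))

# Shuffles in place along the first axis: rows are permuted,
# but the values inside each row keep their order.
np.random.shuffle(arr)

# Sorting the rows recovers the original array, whatever the permutation was.
rows = sorted(arr.tolist())
```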

So I don't know why your code doesn't work.

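But there is another approach: build a shuffled index and use it to select the rows. My first attempt does not work, because random.shuffle operates in place and returns None rather than a shuffled sequence:

shuffledIndex = random.shuffle(xrange(len(data)))  # wrong: shuffledIndex is None
shuffledData = data[shuffledIndex]

Use random.sample instead:

import random
# Sampling every index without replacement gives a random permutation.
shuffledIndex = random.sample(range(len(data1)), len(data1))
shuffledData = data1[shuffledIndex]  # a shuffled copy; data1 itself is unchanged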
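
Indexing with a shuffled index returns a copy, whereas the question asks for an in-place shuffle. A minimal sketch of writing the permuted rows back into the same array, assuming a temporary copy during the assignment is acceptable (the name original is only for illustration):

```python
import numpy as np

data = [(1, 2), (3, 4.1), (13, 77), (5, 10), (11, 30)]
dtype = [('x', float), ('y', float)]
data1 = np.array(data, dtype=dtype)
original = data1.copy()  # kept only for comparison

np.random.seed(10)
# The right-hand side is a shuffled copy; assigning to data1[:] overwrites
# the existing buffer instead of rebinding the name to a new array.
data1[:] = data1[np.random.permutation(len(data1))]
```

Every (x, y) record stays paired, and any other reference to data1 sees the new order.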
Kei Minagawa