Better way to shuffle two related lists

Question

Are there better ways to randomly shuffle two related lists without breaking the correspondence between them? I've found related questions for numpy.array and C#, but not exactly the same one.

As a first try, a simple zip trick will do:

import random
a = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
b = [2, 4, 6, 8, 10]
c = list(zip(a, b))  # zip() is lazy in Python 3, so build a list that can be shuffled
random.shuffle(c)
a = [e[0] for e in c]
b = [e[1] for e in c]
print(a)
print(b)

It produces output like:

[[1, 2], [7, 8], [3, 4], [5, 6], [9, 10]]
[2, 8, 4, 6, 10]

I just find it a bit awkward, and it also needs an additional list.

you can use `zip` to unzip the lists as well: `a,b = zip(*c)` — mgilson, Aug 01 '12 at 18:16
I would also generally not recommend a program design where you need to keep a set of parallel lists. Just keep 1 list. Create some sort of class or something to unify your data. — mgilson, Aug 01 '12 at 18:19
If you want to do this with `numpy`, here is a good solution: http://stackoverflow.com/questions/4601373/better-way-to-shuffle-two-numpy-arrays-in-unison — Mithril, Apr 14 '16 at 08:24
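
The two suggestions from mgilson above can be combined into a small sketch: keep one list of records instead of two parallel lists, shuffle that list, and only unzip with `zip(*...)` when separate sequences are really needed. The `Sample` type and its field names are hypothetical, chosen just for illustration.

```python
import random
from collections import namedtuple

# Hypothetical record type; the field names are illustrative only.
Sample = namedtuple("Sample", ["features", "label"])

samples = [
    Sample([1, 2], 2),
    Sample([3, 4], 4),
    Sample([5, 6], 6),
    Sample([7, 8], 8),
    Sample([9, 10], 10),
]

# One list, one shuffle: the pairing cannot be broken.
random.shuffle(samples)

# Unzip only if two separate sequences are needed; zip(*...) yields tuples.
a, b = zip(*samples)
print(a)
print(b)
```

Note that `a` and `b` come back as tuples here; wrap them in `list(...)` if they need to be mutable.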

kojiro · Accepted Answer · 2019-07-19T13:48:53.510

52

Given the relationship demonstrated in the question, I'm going to assume the lists are the same length and that list1[i] corresponds to list2[i] for any index i. With that assumption in place, shuffling the lists is as simple as shuffling the indices:

 from random import shuffle
 # Given list1 and list2

 list1_shuf = []
 list2_shuf = []
 index_shuf = list(range(len(list1)))
 shuffle(index_shuf)
 for i in index_shuf:
     list1_shuf.append(list1[i])
     list2_shuf.append(list2[i])
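
A self-contained version of the same index-shuffling idea might look like the sketch below. The function name `shuffle_together` and the optional `rng` parameter are assumptions added for illustration (passing a seeded `random.Random` makes the order reproducible); they are not part of the original answer.

```python
import random

def shuffle_together(list1, list2, rng=random):
    """Return shuffled copies of list1 and list2 that share one random order."""
    if len(list1) != len(list2):
        raise ValueError("lists must have the same length")
    index_shuf = list(range(len(list1)))
    rng.shuffle(index_shuf)  # one permutation, reused for both lists
    return ([list1[i] for i in index_shuf],
            [list2[i] for i in index_shuf])

a = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
b = [2, 4, 6, 8, 10]
a_shuf, b_shuf = shuffle_together(a, b, rng=random.Random(0))

# Correspondence survives: each pair in a_shuf still ends with its b_shuf value.
print(all(pair[1] == value for pair, value in zip(a_shuf, b_shuf)))  # True
```

The original lists are left untouched, at the cost of building two new lists plus the index list.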

edited Jul 19 '19 at 13:48

answered Aug 01 '12 at 18:15

kojiro

16

As a fan of list comprehensions: list1_shuf = [list1[i] for i in index_shuf] – Tobias Domhan Nov 02 '13 at 17:06
1

@kojiro : doesn't matter : n*append_ops + n*append_ops = n*(append_ops+append_ops) = 2*n*append_ops – Lazik Dec 12 '13 at 14:01

score 34 · Answer 2 · edited Apr 05 '23 at 07:49

If you are willing to install a few more packages:

Req: NumPy (>= 1.6.1), SciPy (>= 0.9).

pip install -U scikit-learn

from sklearn.utils import shuffle
list_1, list_2 = shuffle(list_1, list_2, random_state = 0)

hafiz031

answered Jan 14 '17 at 10:47

Tihomir Nedev

score 6 · Answer 3 · answered Aug 01 '12 at 18:16

If you have to do this often, you could consider adding one level of indirection by shuffling a list of indexes.

Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import random
>>> a = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
>>> b = [2, 4, 6, 8, 10]
>>> indexes = range(len(a))
>>> indexes
[0, 1, 2, 3, 4]
>>> random.shuffle(indexes)
>>> indexes
[4, 1, 2, 0, 3]
>>> for index in indexes:
...     print a[index], b[index]
...
[9, 10] 10
[3, 4] 4
[5, 6] 6
[1, 2] 2
[7, 8] 8
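
The session above is from Python 2, where `range()` returns a list. In Python 3 a `range` object cannot be shuffled in place, so a rough equivalent would materialize the indexes first:

```python
import random

a = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
b = [2, 4, 6, 8, 10]

# range() is immutable in Python 3, so build a real list before shuffling.
indexes = list(range(len(a)))
random.shuffle(indexes)

# a and b are never modified; only the order in which they are visited changes.
for index in indexes:
    print(a[index], b[index])
```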

score 4 · Answer 4 · edited Nov 27 '18 at 08:17

So far, all solutions created new lists in order to solve the problem. If the lists a and b are very long you may want to shuffle them in place. For that you would need a function like:

import random

def shuffle(a,b):
    assert len(a) == len(b)
    start_state = random.getstate()
    random.shuffle(a)
    random.setstate(start_state)
    random.shuffle(b)

a = [1,2,3,4,5,6,7,8,9]
b = [11,12,13,14,15,16,17,18,19]
shuffle(a,b)
print(a) # [9, 7, 3, 1, 2, 5, 4, 8, 6]
print(b) # [19, 17, 13, 11, 12, 15, 14, 18, 16]
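
The function above works because `random.shuffle` draws the same random numbers for two lists of equal length once the generator state has been restored. If you would rather not depend on replaying the RNG state, a different technique is to write the Fisher-Yates loop by hand and apply every swap to both lists. This is only a sketch; the name `shuffle_pair_inplace` and the `rng` parameter are made up for illustration.

```python
import random

def shuffle_pair_inplace(a, b, rng=random):
    """Fisher-Yates shuffle that applies every swap to both lists in place."""
    if len(a) != len(b):
        raise ValueError("lists must have the same length")
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)  # random position in 0..i inclusive
        a[i], a[j] = a[j], a[i]
        b[i], b[j] = b[j], b[i]

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [11, 12, 13, 14, 15, 16, 17, 18, 19]
shuffle_pair_inplace(a, b)

# Every element of b is still paired with its original partner in a.
print(all(y == x + 10 for x, y in zip(a, b)))  # True
```

No extra list is allocated, and the global RNG state is consumed only once.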

smttsp

answered Apr 17 '18 at 16:36

AlexConfused

My best answer for large list without numpy! – Sangwon Kim Apr 12 '23 at 04:48

score 2 · Answer 5 · edited Nov 28 '17 at 15:18

For a fast answer using numpy, please refer to here.
You can use

p = numpy.random.permutation(len(a))

to create a shuffled array of indexes shared by both lists, and then use it to reorder them.

In your scenario:

In [61]: a = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
In [62]: b = [2, 4, 6, 8, 10]
In [63]: import numpy as np
In [64]: a_ar, b_ar = np.array(a), np.array(b)
In [65]: p = np.random.permutation(len(a))
In [66]: a, b = a_ar[p].tolist(), b_ar[p].tolist()
In [68]: a
Out[68]: [[3, 4], [7, 8], [5, 6], [1, 2], [9, 10]]
In [69]: b
Out[69]: [4, 8, 6, 2, 10]

score 0 · Answer 6 · answered Oct 16 '19 at 09:56

You can unzip at the end to limit the awkwardness a bit:

import numpy as np
list1 = [1,2,3]
list2 = [4,5,7]
list_zipped = list(zip(list1,list2))
np.random.shuffle(list_zipped)
list1, list2 = zip(*list_zipped)  # unzipping; list1 and list2 are now tuples

Arun

score -1 · Answer 7 · answered Aug 01 '12 at 18:17

I'm not sure if I'm missing something here, but it looks like you're just shuffling 1 of the lists and the other one is re-arranged to match the order of the first list. So what you have is the best way to do this without making it more complicated. If you want to go the complicated route you can just shuffle 1 list and use the unshuffled list to do a lookup in the shuffled list and rearrange it in that way. In the end you end up with the same result you started with. Why is creating a third list a problem? If you really want to recycle the lists then you can simply replace list b with what you're using for list c and then separate it later on back to a and b.

Spherical Cowboy · Answer 8 · 2022-03-30T14:58:35.190

-1

A modified version of AlexConfused's approach, which is more general and can be copied and used directly:

from random import shuffle, getstate, setstate


def shuffle_inplace(lst, state):
    """ shuffle multiple lists in-place using order determined by state """
    setstate(state)
    shuffle(lst)


lst1 = [0, 1, 2, 3, 4]
lst2 = [5, 6, 7, 8, 9]

s = getstate()
shuffle_inplace(lst1, s)
shuffle_inplace(lst2, s)

print(lst1)
print(lst2)

edited Mar 30 '22 at 14:58

answered Mar 29 '22 at 16:08

Spherical Cowboy

You probably want to include a link to their answer. https://stackoverflow.com/a/49883307/843953 Also, your version isn't more general IMO, it just moves the capture of the RNG state outside the function. Sure, you can now shuffle any number of lists using this RNG state, but it misses the entire point of the question which is to shuffle lists simultaneously. To generalize their approach you could have your function take _any number of lists_ instead of only two lists. – Pranav Hosangadi Mar 29 '22 at 16:13
I tried to edit the poster's answer before, but the edit queue was too long, hence I posted an answer myself. I added a direct link to the original answer as well now. The question was how "to randomly shuffle two related lists without breaking their correspondence in the other list" without, e.g., zipping, shuffling and extracting lists again. There was no mention of doing so simultaneously. Maybe you're confusing this question with a related one? I needed this precisely because I had to randomly shuffle N lists without breaking correspondence in a succinct and convenient fashion. – Spherical Cowboy Mar 30 '22 at 15:05
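
The generalization suggested in the comments, a single function that takes any number of lists, could be sketched roughly as follows. The name `shuffle_lists_inplace` is hypothetical, and the length check is an added assumption: replaying the RNG state only yields the same permutation when all lists have the same length.

```python
import random

def shuffle_lists_inplace(*lists):
    """Shuffle any number of equal-length lists in place with one shared order."""
    if not lists:
        return
    n = len(lists[0])
    if any(len(lst) != n for lst in lists):
        raise ValueError("all lists must have the same length")
    state = random.getstate()
    for lst in lists:
        random.setstate(state)  # replay the same random draws for each list
        random.shuffle(lst)

lst1 = [0, 1, 2, 3, 4]
lst2 = [5, 6, 7, 8, 9]
lst3 = list("abcde")
shuffle_lists_inplace(lst1, lst2, lst3)

print(lst1)
print(lst2)
print(lst3)
```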
