One could use Fisher-Yates shuffle to fundamentally re-implement random.shuffle()
to accept a first
and last
index as arguments, e.g.:
import random
def valid_index(i, n):
assert(-n <= i < n)
return i % n
def shuffle(seq, first=0, last=-1, rand_int_gen=None):
n = len(seq)
first = valid_index(first, n)
last = valid_index(last, n)
# use Fisher-Yates shuffle (Durstenfeld method)
if callable(rand_int_gen):
for i in range(first, last):
j = rand_int_gen(i, last)
seq[i], seq[j] = seq[j], seq[i]
else:
getrandbits = random.getrandbits
for i in range(first, last + 1):
size = last - i + 1
j = getrandbits(size.bit_length()) % size + i
seq[i], seq[j] = seq[j], seq[i]
return seq
to be used like:
l = list(range(20))
print(l)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
random.seed(0) # just to show reproducible results
shuffle(l, 0, 9)
print(l)
# [6, 7, 2, 5, 8, 4, 9, 3, 0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
Time-wise this is actually even a few percent faster than random.shuffle()
for shuffling the whole sequence.
This is faster essentially because it gets the random values directly from random.getrandbits()
which is the closest method exposed by random
for random integer generation, the others, e.g. randint()
and randrange()
eventually reducing to this.
These last two eventually use internally _getrandbelow()
which could be calling getrandbits()
more often the necessary.
for k in range(1, 7):
n = 10 ** k
print(n)
%timeit l = list(range(n)); random.shuffle(l)
%timeit l = list(range(n)); shuffle(l)
print()
10
100000 loops, best of 3: 6.16 µs per loop
100000 loops, best of 3: 3.85 µs per loop
100
10000 loops, best of 3: 54.3 µs per loop
10000 loops, best of 3: 28 µs per loop
1000
1000 loops, best of 3: 585 µs per loop
1000 loops, best of 3: 341 µs per loop
10000
100 loops, best of 3: 6.01 ms per loop
100 loops, best of 3: 3.56 ms per loop
100000
10 loops, best of 3: 71.7 ms per loop
10 loops, best of 3: 44.1 ms per loop
1000000
1 loop, best of 3: 815 ms per loop
1 loop, best of 3: 582 ms per loop
This approach was also suggested here, as pointed out by @usr2564301.
Unfortunately, I think there is no better approach for doing this operation in-place.