I want to iterate over the integers in the range `0` to `N-1`, where `N` is a large number. This can easily be done with `for i in range(N):`.
However, I want to iterate over the numbers in a random order. This can also easily be done using something like:

    from random import shuffle

    a = list(range(N))  # materializes all N integers in memory
    shuffle(a)          # shuffles the list in place
    for i in a:
        do_something(i)
The problem with this approach is that it requires storing the entire list of numbers in memory (`shuffle(range(N))` raises an error). This makes it impractical for my purposes when `N` is large.
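To make the problem concrete, here is a small sketch assuming Python 3. The value of `N` is just an illustrative placeholder; the exact sizes printed depend on the interpreter.

```python
import sys
from random import shuffle

N = 1000  # small illustrative value; the real N is much larger

# range(N) is lazy: its size does not grow with N.
range_size = sys.getsizeof(range(N))

# list(range(N)) holds one reference per element, so it grows linearly with N
# (getsizeof does not even count the int objects themselves).
list_size = sys.getsizeof(list(range(N)))

print(range_size, list_size)

# shuffle swaps elements in place, so it needs a mutable sequence;
# a range object does not support item assignment.
raised = False
try:
    shuffle(range(N))
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```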
I would like to have an object which is an iterator (just like `range(N)`), which does not store all the numbers in memory (again, just like `range(N)`), and which iterates in a random order.
Now, when I say "random order" I really mean that the order is sampled from the uniform distribution over the set of all permutations of `(0,1,...,N-1)`. I know that the number of such permutations is potentially very large (`N!`), and therefore if the iterator needed to represent which permutation it uses, its state would itself take a lot of memory.
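To put a rough number on that: distinguishing between all `N!` permutations takes at least log2(N!) bits of state, which grows roughly like N*log2(N), so it is comparable to storing the shuffled list itself in packed form. A quick sketch of the arithmetic (the helper name `bits_to_index_permutation` is made up for illustration):

```python
import math


def bits_to_index_permutation(n: int) -> float:
    """Return log2(n!), the minimum number of bits needed to tell
    all n! permutations of range(n) apart."""
    # lgamma(n + 1) == ln(n!), computed without building the huge integer n!
    return math.lgamma(n + 1) / math.log(2)


for n in (10, 1_000, 1_000_000):
    bits = bits_to_index_permutation(n)
    print(f"N={n}: about {bits:.0f} bits ({bits / 8 / 1024:.1f} KiB)")
```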
Therefore, I can settle for "random order" meaning "looks like a uniform distribution although it actually is not", in some sense which I have not defined.
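To illustrate the kind of trade-off I mean, here is a minimal sketch of something that has the right interface and constant memory but is clearly too regular. The name `affine_order_range` and its `seed` parameter are made up for illustration. It visits `(a*i + b) % n` for a random multiplier `a` coprime to `n`, which hits every value exactly once.

```python
import math
import random


def affine_order_range(n, seed=None):
    """Yield every integer in range(n) exactly once, in a scrambled order.

    Toy sketch only: it can produce at most n * phi(n) of the n! possible
    orders, and consecutive outputs always differ by the same step mod n.
    """
    if n <= 0:
        return
    rng = random.Random(seed)
    # A multiplier coprime to n makes i -> (a*i + b) % n a bijection on range(n).
    a = rng.randrange(1, n + 1)
    while math.gcd(a, n) != 1:
        a = rng.randrange(1, n + 1)
    b = rng.randrange(n)  # random starting offset
    for i in range(n):
        yield (a * i + b) % n


print(list(affine_order_range(10, seed=0)))
```

This only keeps `a`, `b` and a counter in memory, but the output has an obvious pattern, so it is not "random looking" enough for my purposes. I am after something that looks much closer to uniform.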
If I had such an iterator, this is how I would use it:

    a = random_order_range(N)  # this object takes far less memory than N! would suggest
    for i in a:
        do_something(i)
Any ideas how this can be done?
EDIT1:
Actually, what I am really interested in is memory consumption even lower than ~`N`, if possible... Maybe something like `O(k*N)` for some `k` that could be much smaller than 1.