
I have a range of integers I want to iterate through. You can assume the sequence begins with 1 and ends with n, where n > 1. However, I do not want to iterate through them sequentially. The goal is to iterate through all of them in a random order. In addition, n can be very large, possibly in the trillions, so I cannot store the range in memory. Is there a way to do this?

I'm aware that there is a way to do this with an array, already in memory. Can something similar be done when you cannot store the entire range at once?

  • are you looking for an algorithm ?... language specific? ... – Grady Player May 05 '15 at 20:16
  • What do you mean by 'random' here - do you need any specific statistical properties? Or just a generalized "mix 'em all up"? – BadZen May 05 '15 at 20:17
  • If you aren't storing the entire array, you'll still need to at least store which ones you've already output, which eventually will be equivalent to storing the entire array. Unless you don't mind using the same number twice, in which case you are just looking for a random number generator. – chepner May 05 '15 at 20:18
  • One approach would be to pick a random reordering of the bits of n and just count binary in the usual way in the preimage, for example. But you wouldn't want to do that if you are depending on this randomness for security. – BadZen May 05 '15 at 20:19
  • @chepner - Surely not. Do you need to store the entire range when you increment x+1? Of course not. Do you need to store the entire range to specify a given permutation of (0...n)? Also no. So, compose the two... – BadZen May 05 '15 at 20:20
  • @badzen that is kind of a cool idea... certainly not a very random way to go, but kind of cool anyway – Grady Player May 05 '15 at 20:21
  • Well, it's "random" in some senses but not others. And there are lots of other simple or less simple ways for different other senses (gray codes, random permutation polynomials on Z_n, etc, etc)... depends on what OP is looking for. – BadZen May 05 '15 at 20:22
  • practically you only need to store one bit for each number though... so if you had a range of 1 to 1 million you only need 125 kB to represent the buffer. – Grady Player May 05 '15 at 20:22
  • Wow, lot of comments... I want an algorithm not necessarily tied to any language. Given n, I want to be able to iterate through each value at most once in an order other than sequential. So yes, @BadZen, a mix 'em up would suffice. I need each integer to appear only once. Ideally, the random function that determines the next integer is uniformly distributed from `1` to `n`. It doesn't have to be but I would prefer something that is not deterministic. –  May 05 '15 at 20:41
  • Well, then, what I suggested. An instance P gives you all of the numbers, once each, and P[x] is uniform over (0,N). (P[x] given priors is obviously not uniform on the "remaining" set, though.) – BadZen May 05 '15 at 20:44
  • If you want random and non-repeating you're going to _have_ to store the sequence in memory, otherwise you have no way of knowing if a number has been previously chosen. If you can determine if a number has been chosen logically then the sequence is not random. – D Stanley May 05 '15 at 20:48
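
BadZen's bit-reordering suggestion from the comments can be sketched roughly as follows. This is only an illustration: the function name `bit_shuffle_order` and the `seed` parameter are made up for this example, and skipping counter values that land outside the range is an assumption added here so that n does not have to be a power of two. As noted in the comments, this mixes the order but is not suitable where the randomness matters for security.

```python
import random


def bit_shuffle_order(n, seed=None):
    """Yield every integer in 1..n exactly once in a bit-shuffled order."""
    # Number of bits needed to represent 0 .. n-1.
    width = (n - 1).bit_length()
    # A random permutation of the bit positions; this is the only stored state.
    positions = list(range(width))
    random.Random(seed).shuffle(positions)

    # Count upward in the usual way and move each bit to its shuffled position.
    for counter in range(1 << width):
        value = 0
        for src, dst in enumerate(positions):
            if (counter >> src) & 1:
                value |= 1 << dst
        # Skip results that fall outside 0 .. n-1 (fewer than half of them).
        if value < n:
            yield value + 1  # shift to the 1..n range from the question


for x in bit_shuffle_order(10, seed=42):
    print(x)
```

Memory use is proportional to the number of bits in n, not to n itself, because only the shuffled bit positions and the current counter are kept.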

1 Answer


You can use Format-Preserving Encryption to encrypt a counter.

Pick a symmetric cipher that is crafted to encrypt the numbers up to N (for example, the proposed AES-FFX scheme). Then generate a random key with high entropy and encrypt the counter values 0, 1, 2, ... in increasing order. Since encryption is a 1:1 mapping and the output of a good cipher looks random, you end up with a sequence of non-repeating random-looking numbers using only O(1) storage.
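
As a rough illustration of the "encrypt a counter" idea, here is a minimal sketch. It is not AES-FFX: it uses a small Feistel network with HMAC-SHA256 as the round function, plus cycle walking to stay inside the range, purely as a stand-in that runs with the Python standard library. The names `make_permutation`, `round_value`, and the choice of 4 rounds are assumptions made for this example and carry no security claim.

```python
import hashlib
import hmac
import secrets


def make_permutation(n, key, rounds=4):
    """Return a function mapping [0, n) onto itself one-to-one (toy Feistel, not AES-FFX)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Use two equal halves whose combined width covers every value below n.
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1

    def round_value(r, value):
        # Keyed round function built from HMAC-SHA256 (illustrative choice).
        msg = bytes([r]) + value.to_bytes(16, "big")
        digest = hmac.new(key, msg, hashlib.sha256).digest()
        return int.from_bytes(digest[:8], "big") & mask

    def encrypt_once(x):
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            left, right = right, left ^ round_value(r, right)
        return (left << half_bits) | right

    def permute(x):
        if not 0 <= x < n:
            raise ValueError("x out of range")
        y = encrypt_once(x)
        while y >= n:  # cycle walking: re-encrypt until the value is below n
            y = encrypt_once(y)
        return y

    return permute


n = 10
permute = make_permutation(n, secrets.token_bytes(32))
order = [permute(i) + 1 for i in range(n)]  # shift to 1..n; each value appears once
print(order)
```

Only the key and the current counter need to be stored, so memory use does not grow with n. For real use, a vetted format-preserving encryption implementation would replace the toy Feistel network.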

Using a block cipher in counter mode (CTR) is a well-known technique for building a cryptographically secure pseudo-random number generator (CSPRNG). Section 10.2.1 of NIST SP 800-90A gives additional tips. AES is recommended as the underlying block cipher for stochastic simulation, since its output does not show any statistical weaknesses.
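
The counter-mode construction can be sketched in a few lines. Note the assumptions: the Python standard library has no AES, so this example substitutes HMAC-SHA256 as the keyed function applied to the counter, and it is not the CTR_DRBG procedure from NIST SP 800-90A. The name `counter_stream` is made up for illustration.

```python
import hashlib
import hmac
import secrets


def counter_stream(key, start=0):
    """Yield 32-byte pseudo-random blocks by applying a keyed function to a counter."""
    counter = start
    while True:
        # HMAC-SHA256 stands in for the block cipher here (assumption, not AES).
        yield hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()
        counter += 1


stream = counter_stream(secrets.token_bytes(32))
for _ in range(3):
    print(next(stream).hex())
```

The only state is the key and the counter, which is the same property the counter-encryption approach above relies on.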

The idea is anything but new and has already been proposed on Stack Overflow multiple times by Craig McQueen:

crypto.stackexchange also has several threads about this:

Community
Thomas B.