2

I have three questions about RNG.

The first is what data is available to use as a seed. I have always used the time but there must be other easily available seeds.

What easily available seeds are there in C++?

If I reseeded the RNG at random intervals, based on the next value to come out of the RNG, and chose the seed at random from the answers to question 1, would this create a pseudo-random chain that was harder to predict and therefore more random?

Lastly, what is the best way in C++ to get a random number within a range? I have been using the modulus operator, but I want something that is evenly spread across the range and does not favour high or low values, as it is for the AI decisions.

Skeith

4 Answers

3
  1. It depends on what you need rand for. For a lot of uses, time is perfectly adequate. For others less so. I'll usually read four bytes from /dev/random on a Unix machine, reverting to time if /dev/random isn't available. A more effective solution would be to use a higher resolution timer, and to hash in things like the process or the machine id.

  2. Reseeding probably won't change things much unless you're using something like /dev/random to do it. Most of the other available values are pretty predictable.

  3. If RAND_MAX + 1 is a multiple of the range, modulo works fine. If it isn't, the only solution is to throw away values: you have a total of RAND_MAX + 1 values, and you need n. And there is no possible mapping that will map all of the RAND_MAX + 1 values to n results and have the same number of inputs for each result. The usual solution is something like:

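    #include <cstdlib>
    // Returns a value in [0, n), discarding draws that would bias the result.
    int uniformRand( int n )
    {
        int limit = (RAND_MAX + 1) - (RAND_MAX + 1) % n;
        int result = rand();
        while ( result >= limit )
            result = rand();
        return result % n;
    }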

(I'm supposing here that you're looking for a result in the range [0...n). And that RAND_MAX + 1 won't overflow.)
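To see why modulo alone cannot be uniform in general, a small sketch can count how many generator outputs fall into each bucket. The names `bucketCounts`, `acceptLimit` and `total` (standing in for RAND_MAX + 1) are made up for this illustration and are not part of any library.

```cpp
#include <vector>

// Counts how many of the `total` possible generator outputs (0 .. total-1)
// land in each bucket when reduced with `% n`.
// `total` plays the role of RAND_MAX + 1 (illustrative name).
std::vector<int> bucketCounts(int total, int n)
{
    std::vector<int> counts(n, 0);
    for (int value = 0; value < total; ++value)
        ++counts[value % n];
    return counts;
}

// Number of outputs kept by the rejection loop: the largest multiple of n
// that does not exceed total.
int acceptLimit(int total, int n)
{
    return total - total % n;
}
```

With a toy generator of 8 outputs and n = 3, the buckets receive 3, 3 and 2 values. After discarding everything at or above the limit of 6, each bucket gets exactly 2.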

Finally, if you're worried about the quality of the random values, be aware that many implementations of rand() aren't particularly good. You might want to switch to one of the boost random generators.
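The seeding strategy from point 1 could be sketched roughly as follows. This is only an illustration: `readSeedFromDevice`, `timeBasedSeed` and `makeSeed` are hypothetical names, the high resolution timer comes from C++11 `<chrono>` (an assumption about the available compiler), and hashing in the process or machine id is left out.

```cpp
#include <chrono>
#include <cstdlib>
#include <ctime>
#include <fstream>

// Tries to fill `seed` with bytes read from a random device.
// Returns false if the device cannot be opened or read.
bool readSeedFromDevice(const char* path, unsigned int& seed)
{
    std::ifstream device(path, std::ios::in | std::ios::binary);
    if (!device)
        return false;
    device.read(reinterpret_cast<char*>(&seed), sizeof seed);
    return static_cast<bool>(device);
}

// Fallback: mix wall-clock time with a higher resolution timer.
unsigned int timeBasedSeed()
{
    unsigned long long ticks = static_cast<unsigned long long>(
        std::chrono::high_resolution_clock::now().time_since_epoch().count());
    unsigned int seed = static_cast<unsigned int>(std::time(nullptr));
    seed ^= static_cast<unsigned int>(ticks);
    seed ^= static_cast<unsigned int>(ticks >> 32);
    return seed;
}

// Prefers the random device, reverting to time if it isn't available.
unsigned int makeSeed(const char* path = "/dev/random")
{
    unsigned int seed = 0;
    if (readSeedFromDevice(path, seed))
        return seed;
    return timeBasedSeed();
}
```

A program would typically call `std::srand(makeSeed());` once at startup and not reseed afterwards.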

James Kanze
  • 1
    I believe reseeding is not a good idea. The random number generator code is designed to produce good results when seeded once. Reseeding it invalidates the assumption the authors made and will probably produce less randomness. – Jay Jun 22 '11 at 14:26
  • @Jay It depends on the source of the seed. If the source is truly random (`/dev/random`), then reseeding will add to the randomness; the random generator is less random than `/dev/random`. Ideally, in fact, you'd just use `/dev/random`, and forget about the generator, but `/dev/random` is too slow: it uses entropy, and entropy doesn't build up too quickly, so once you (or anyone) has read something like 64 bytes, following bytes can take a few seconds each. – James Kanze Jun 22 '11 at 14:49
  • @James, even that's not necessarily true. If you'd reseed *the entire state-vector* of the RNG, then sure, but if you (typically) just reseed with a 32-bit int, then even reseeding from `/dev/random` may make things worse. After all, there are just 32 bits of randomness in each seed which then fully determine the subsequent values - i.e. once in about every `2^16` reseeds you can expect to see duplicate seeds and thus common subsequences - generally Not Good. – Eamon Nerbonne Jun 23 '11 at 06:55
  • Basically, if you need to "pimp your RNG" with occasional reseeding, you should be looking for a different RNG. – Eamon Nerbonne Jun 23 '11 at 06:56
  • @Eamon As I said, the ideal would be to just use `/dev/random`, and be done with it. It's a truly random source. Reseeding causes the random generator to start at some new random point. As for duplicate seeds---truly random numbers will show duplicates from time to time. – James Kanze Jun 23 '11 at 07:45
  • Yeah, using /dev/random would ideally be truly random; it's as good as you can expect anyhow. But the point isn't that any random sequence has duplicates, it's *much worse* when you reseed occasionally, since in that case you won't just occasionally have a single (unpredictable, unexploitable) duplicate, you'll have an entire sequence duplicated. That's just not a good idea. – Eamon Nerbonne Jun 23 '11 at 18:16
  • It's not a huge flaw; but the point is that if you want security, that's not going to give it; and if you want stochastic well-distributed noise, that's not going to give it *either*. i.e. just use `/dev/random`, a decent PRNG, or *something* better than the questioner's reseeding approach. – Eamon Nerbonne Jun 23 '11 at 18:19
2

You should take a look at boost::random.

Keep in mind:

  • Why do you need random numbers - for security, or for a stochastic process?
  • What do you mean by "more random"?
  • If you reseed according to an algorithm and that algorithm is more predictable than the underlying RNG, you're worse off than you started with. If your seeds are just 32-bit values, then even a truly random seed source can make things worse not better!
  • If you need to "mix-in" extra randomness from a random source, it may be better to do that via XOR: i.e. maintain a small pad of truly random numbers, and cyclically XOR them into your RNG's output - and instead of reseeding occasionally, regenerate that pad occasionally. That way, you don't throw away the RNG's valuable internal state. Alternatively, if you have access to RNG internals, use the real random source to occasionally twiddle some bits by a similar mechanism.
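The XOR-pad idea from the last point might look something like the sketch below. `PaddedRand` is a hypothetical wrapper invented for this example. It assumes C++11's `std::random_device` as the random source (which is not guaranteed to be truly random on every platform) and assumes that RAND_MAX + 1 is a power of two, so the XOR cannot push a value out of range.

```cpp
#include <cstddef>
#include <cstdlib>
#include <random>
#include <vector>

// Hypothetical helper: wraps rand() and XORs a small pad of values from
// std::random_device into its output, cycling through the pad.
class PaddedRand
{
public:
    explicit PaddedRand(std::size_t padSize = 16)
        : pad_(padSize ? padSize : 1), next_(0)
    {
        refreshPad();
    }

    // Regenerate the pad occasionally instead of reseeding the generator.
    void refreshPad()
    {
        std::random_device source;
        for (std::size_t i = 0; i < pad_.size(); ++i)
            pad_[i] = source() & RAND_MAX;  // assumes RAND_MAX + 1 is a power of two
    }

    // Result stays within [0, RAND_MAX] under the assumption above.
    int operator()()
    {
        unsigned int value = static_cast<unsigned int>(std::rand()) ^ pad_[next_];
        next_ = (next_ + 1) % pad_.size();
        return static_cast<int>(value);
    }

private:
    std::vector<unsigned int> pad_;
    std::size_t next_;
};
```

The generator's internal state is never touched here; only its output is combined with the pad, and `refreshPad()` takes the place of an occasional reseed.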

I expect you can simply use boost::mt19937, but it really depends on the application.
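As a rough sketch of what using a Mersenne Twister looks like, here is the same generator via the C++11 standard library (`std::mt19937`), swapped in for Boost on the assumption that a C++11 compiler is available. `randomInRange` and `makeEngine` are names invented for this example.

```cpp
#include <random>

// Returns a uniformly distributed integer in [low, high] (both inclusive).
// The engine is passed in so that it is seeded once and then reused.
int randomInRange(std::mt19937& engine, int low, int high)
{
    std::uniform_int_distribution<int> dist(low, high);
    return dist(engine);
}

// Builds an engine seeded once from std::random_device.
std::mt19937 makeEngine()
{
    std::random_device source;
    return std::mt19937(source());
}
```

A caller would create one engine with `makeEngine()` at startup and then call `randomInRange(engine, 1, 20)` for each decision. The distribution takes care of the range mapping, so no modulo is involved.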

Eamon Nerbonne
  • I've yet to use boost and am wary about including it just for a better RNG than rand(). The idea was to make sure the AI did not follow a pattern that was too predictable. – Skeith Jun 22 '11 at 12:22
  • I'm not sure what your AI does, but if this is for some game-like logic, then it's *highly* likely that the random numbers are so far removed from the in-game consequences that even a poor random number generator will not be *noticeably* biased. – Eamon Nerbonne Jun 22 '11 at 12:27
  • `boost` is extremely useful, I'd recommend swallowing that pill and saving the development effort. There's lots of other goodness in boost too... – Eamon Nerbonne Jun 22 '11 at 12:28
  • ...and since it's almost all header-only, you can use it with very little installation effort: just unzip it, and add the path to the include path. – Eamon Nerbonne Jun 22 '11 at 12:29
1

One easily available seed for an RNG is any time function. The seed does not need to be random; as long as it is somewhat different each time you start the program, that's good enough. Trying to make a pseudorandom number "more random" is a somewhat silly endeavor. If this is needed, then the generator is not worth its salt.
Also, unless you seed regularly with true random noise, you will not make the output any "more random" anyway, and if you do seed with true random noise regularly, only the seeds will be truly random; the other values are still deterministic and altogether have the same statistical properties as any other sequence generated by this generator.

The usual implementation for getting numbers in a non-power-of-two range if a skewed distribution is not acceptable looks somewhat like:

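int range = high - low;
int r;

// Draw again until the value falls inside [0, range].
while ((r = rand()) > range) {}

r += low;  // r is now uniform in [low, high]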

Modulo and multiplication/division have the well-known skew and overflow problems.

Though, if it's for AI decisions as you said, I daresay that nobody will notice if one result is 1% more likely than another. Therefore simply using modulo is probably good enough, has deterministic time, and is dead simple. Also note that you can always choose a range that works well with modulo.
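When `range` is small compared to RAND_MAX, the plain rejection loop throws away nearly every draw. One possible refinement, sketched here under stated assumptions, masks the draw down to the next power of two before rejecting. `maskFor` and `randInRange` are made-up names, and the sketch assumes that RAND_MAX + 1 is a power of two and that `high - low` does not exceed RAND_MAX.

```cpp
#include <cstdlib>

// Smallest mask of the form 2^k - 1 that is >= range.
unsigned int maskFor(unsigned int range)
{
    unsigned int mask = 0;
    while (mask < range)
        mask = (mask << 1) | 1u;
    return mask;
}

// Uniform value in [low, high], assuming RAND_MAX + 1 is a power of two
// and high - low <= RAND_MAX.
int randInRange(int low, int high)
{
    unsigned int range = static_cast<unsigned int>(high - low);
    unsigned int mask = maskFor(range);
    unsigned int r;
    do {
        r = static_cast<unsigned int>(std::rand()) & mask;
    } while (r > range);  // fewer than half of the draws are rejected
    return low + static_cast<int>(r);
}
```

The running time is still not deterministic, but each draw is accepted with probability above one half, so the loop is expected to finish within two iterations on average.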

Damon
  • 1
    If `range` is 6 (say to emulate throwing a die), and `RAND_MAX` is 2147483647, or something close (which it should be on a good 32 bit implementation), then you're going to throw away a lot of values; your loop might run a long time. – James Kanze Jun 22 '11 at 12:50
  • Yep, entirely correct. In such a case you'd probably want to mask out the `(next_larger_pow2 - 1)` bits first. I was merely describing the principle. It is not very desirable anyway unless you really really really need it, because the runtime is non-deterministic. There is a well-known name for that algorithm too, but I forgot it... probably something like "range rejection" or such. – Damon Jun 22 '11 at 12:57
  • You want the highest multiple of `range`, `<= RAND_MAX`. And I don't know of a name for the technique, either. – James Kanze Jun 22 '11 at 13:06
0

Time is good enough for most applications; if you need better random numbers, then you should look into specific libraries that provide such functionality.

For range generation, the following doesn't skew the results:

int random = rand() * RANGE / RAND_MAX;
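If the scaling approach is used, a slightly more defensive variant could look like the sketch below. `scaleToRange` is a hypothetical helper. It widens the multiplication to `long long` so that it cannot overflow, and it divides by RAND_MAX + 1 so that the result never reaches `range` itself. The distribution is still only exactly even when RAND_MAX + 1 is a multiple of `range`.

```cpp
#include <cstdlib>

// Maps a raw generator value in [0, RAND_MAX] to [0, range) by scaling.
// Widening to long long avoids overflow in value * range, and dividing
// by RAND_MAX + 1 (not RAND_MAX) keeps the result strictly below range.
int scaleToRange(int value, int range)
{
    return static_cast<int>(
        static_cast<long long>(value) * range / (RAND_MAX + 1LL));
}
```

It would be called as `int random = scaleToRange(rand(), RANGE);`.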

Šimon Tóth
  • I remember seeing in a similar question that that only works if range is exactly divisible by `RAND_MAX`. – Skeith Jun 22 '11 at 12:20
  • `rand() * RANGE` may overflow, such that division by `RAND_MAX` heavily favors numbers closer to zero. – Eamon Nerbonne Jun 22 '11 at 12:23
  • @Skeith More importantly, it likely overflows (although for some reason, a lot of implementations artificially set `RAND_MAX` to 32767). – James Kanze Jun 22 '11 at 12:23
  • @Skeith it's the other way around; this is only OK if `(RAND_MAX+1) % RANGE == 0` and only if RANGE*RAND_MAX doesn't overflow. – Eamon Nerbonne Jun 22 '11 at 12:25
  • @Eamon What is the highest you could set range before it overflowed? On average it's 1 - 20. – Skeith Jun 22 '11 at 12:41