
I just came across an interesting question from ComputerGuru on Hacker News, and none of the comments there gives a convincing answer.

Why does mt_rand(1, PHP_INT_MAX) always return an odd number?

I am not the author of the original question.

http://3v4l.org/dMbat

for ($i = 0; $i < 10000; $i++) {
    echo mt_rand(1, PHP_INT_MAX) . "\n";
}

Output:

8571620074060775425
7401021871338029057
4351677773593444353
1801559362708176897
7848614552286527489
...
PeeHaa
Pier-Alexandre Bouchard
  • Probably related to this, taken from [the PHP manual page for mt_rand](http://php.net/manual/en/function.mt-rand.php): *Caution: The distribution of mt_rand() return values is biased towards even numbers on 64-bit builds of PHP when max is beyond 2^32. This is because if max is greater than the value returned by mt_getrandmax(), the output of the random number generator must be scaled up.* – Simba Jul 24 '15 at 13:45
  • Because PHP_INT_MAX > mt_getrandmax(), [more details](https://www.reddit.com/r/lolphp/comments/3eaw98/mt_rand1_php_int_max_only_generates_odd_numbers/ctdhxha). – Sony Jul 24 '15 at 13:47

1 Answer


PHP_INT_MAX here is 2^63-1 (64-bit signed int max).

However, mt_rand() doesn't handle values this large. The Mersenne Twister internally generates 32-bit words, and PHP's mt_getrandmax() is only 2^31-1 (it shifts each word right by one, throwing away the lowest bit).

To generate a value in your requested min to max range, mt_rand first gets the 0 to 2^31-1 random number, then scales it using this formula:

x = ((x / (mt_getrandmax() + 1)) * (max - min + 1)) + min;

(See the source of rand.c and php_rand.h.)

Basically it blindly scales the internally generated number to fit the overlarge range, without even raising a warning. Here the requested range is 2^63 values wide and the raw number has only 31 bits, so the scaling amounts to multiplying by 2^32. That leaves the low 32 bits all zero, and adding min (which is 1) then makes every result odd.
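The scaling can be re-created outside of mt_rand() to see where the zero bits come from. This is only a sketch: scale_to_range() and RAW_MAX are hypothetical names, it assumes a 64-bit build of PHP 7 or later and that the arithmetic is done in double precision, and the raw values are hard-coded (taken from the high halves of results like those in the question) because the built-in mt_rand() may behave differently depending on the PHP version.

```php
<?php
// Stand-in for mt_getrandmax() (2^31 - 1), hard-coded for this sketch.
const RAW_MAX = 2147483647;

// Hypothetical helper (not a PHP built-in): applies the scaling formula
// to an already generated 31-bit raw value.
function scale_to_range(int $raw, int $min, int $max): int
{
    $fraction = $raw / (RAW_MAX + 1.0);     // raw / 2^31, in [0, 1)
    $width    = (float) $max - $min + 1.0;  // rounds to 2^63 for (1, PHP_INT_MAX)
    return $min + (int) ($width * $fraction);
}

// Each result is raw * 2^32 + 1: the raw value lands in the high
// 32 bits and the low 32 bits are always 00000001.
foreach ([0x41e0449b, 0x53d33d7c, 0x0d0cd442] as $raw) {
    printf("%016x\n", scale_to_range($raw, 1, PHP_INT_MAX));
}
```

With those three raw values the sketch prints 41e0449b00000001, 53d33d7c00000001 and 0d0cd44200000001.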

The problem is more dramatic in hexadecimal, where you can see that the low 32 bits of each number are completely non-random:

for ($i = 0; $i < 10000; $i++) {
    printf("%016x\n", mt_rand(1, PHP_INT_MAX));
}

Output:

41e0449b00000001
53d33d7c00000001
6ec8855700000001
234140e000000001
13a4581900000001
77547beb00000001
35a0660a00000001
0d0cd44200000001
...

There is a note in the manual that tries to warn about this, although it understates the problem:

The distribution of mt_rand() return values is biased towards even numbers on 64-bit builds of PHP when max is beyond 2^32. This is because if max is greater than the value returned by mt_getrandmax(), the output of the random number generator must be scaled up.

(It says it's biased towards even numbers, but that's only true when min is even.)
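The dependence on min can be checked with the same kind of re-creation of the scaling step. Again this is a sketch under assumptions: scaled_value() is a hypothetical helper, the raw value is assumed to be 31 bits wide, the arithmetic is assumed to be done in doubles, and a 64-bit build of PHP 7 or later is assumed.

```php
<?php
// Hypothetical re-creation of the scaling step, not the built-in mt_rand().
function scaled_value(int $raw, int $min, int $max): int
{
    // 2147483648.0 is mt_getrandmax() + 1, i.e. 2^31.
    return $min + (int) (((float) $max - $min + 1.0) * ($raw / 2147483648.0));
}

foreach ([0, 1, 2, 3] as $min) {
    $parities = [];
    foreach ([1, 12345, 2147483647] as $raw) {
        // The scaled part is a multiple of 2^32, so the parity of the
        // result is just the parity of $min.
        $parities[] = scaled_value($raw, $min, PHP_INT_MAX) % 2;
    }
    printf("min=%d -> parities %s\n", $min, implode(',', $parities));
}
```

For an even min every result comes out even, and for an odd min every result comes out odd, whatever the raw value is.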

Boann
  • Great explanation, thanks. Odd that PHP doesn't throw an error/exception if you pass it a value rather than `mt_getrandmax()`. – ceejayoz Jul 24 '15 at 15:57