I'm trying to convert this line of MATLAB to C++: rp = randperm(p);

According to the randperm documentation:

randperm uses the same random number generator as rand

And on the rand page:

rand returns a single uniformly distributed random number

So rand follows a uniform distribution. My C++ code is based on:

std::random_device rd;
std::mt19937 g(rd());
std::shuffle(... , ... ,g);

My question is: does the code above follow a uniform distribution? If not, how can I make it do so?

justHelloWorld
  • [`std::uniform_int_distribution`](http://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution#Example) – BoBTFish Jul 14 '16 at 07:28
  • At least `std::mt19937` should be uniform, [if you trust the original authors' paper title](https://dx.doi.org/10.1145%2F272991.272995). And `std::random_device` is just used to seed the Mersenne Twister, so it does not really have to be uniform in any way. – mindriot Jul 14 '16 at 07:43
  • @mindriot thanks for your answer. Then what is the difference compared with `std::uniform_int_distribution` described above? – justHelloWorld Jul 14 '16 at 07:45
  • `mt19937` is a random number _engine_, it produces "raw" random numbers (32-bit or 64-bit), and those should be as uniform as possible. `uniform_int_distribution` turns that random input into something matching a _particular_ distribution (e.g. uniform over {1, …, 6}). – mindriot Jul 14 '16 at 07:47
  • `shuffle` constructs its own distribution internally; it just needs a source of randomness: an engine. `mt19937` is the best one in the standard library. – Revolver_Ocelot Jul 14 '16 at 07:57
  • @Revolver_Ocelot If you use a biased generator to feed `shuffle`, would `shuffle` not also produce different permutations with different probabilities? If not, how does it construct uniform permutation probabilities when it does not know the characteristics of the generator? Ergo, the generator must be uniform, right? – mindriot Jul 14 '16 at 08:03
  • @mindriot A distribution is needed to adapt the raw 32/64-bit output of a random generator to an arbitrary range without loss of uniformity (the modulo operator, for example, is not suited for this). So yes, a non-uniform generator translates into a non-uniform distribution, because distributions are just user-friendly adapters; they are not magical. Although there are techniques which allow you to get uniform chances for all permutations even with biased generators, I doubt that they are actually used in practical shuffle implementations. – Revolver_Ocelot Jul 14 '16 at 08:19

1 Answer

The different classes from the C++ random number library roughly work as follows:

  • std::random_device is a uniformly-distributed random number generator that may access a hardware device in your system, or something like /dev/random on Linux. It is usually just used to seed a pseudo-random generator, since the underlying device will usually run out of entropy quickly.
  • std::mt19937 is a fast pseudo-random number generator using the Mersenne Twister engine which, according to the original authors' paper title, is also uniform. It generates pseudo-random 32-bit unsigned integers (std::mt19937_64 is the 64-bit variant). Since std::random_device is only used to seed this generator, it does not have to be uniform itself (e.g., generators are often seeded with the current time stamp, which is definitely not uniformly distributed).
  • Typically, you use a generator such as std::mt19937 to feed a particular distribution, e.g. a std::uniform_int_distribution or std::normal_distribution, which then maps the raw generator output to the desired distribution shape.
  • std::shuffle, according to the documentation,

    Reorders the elements in the given range [first, last) such that each possible permutation of those elements has equal probability of appearance.
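To illustrate the engine/distribution split described in the list above, here is a minimal sketch. The helper name roll_die and its parameters are made up for this example; only std::mt19937 and std::uniform_int_distribution come from the standard library.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper (not part of any library): draws one value
// uniformly from the closed range {1, ..., sides}.
int roll_die(std::mt19937& engine, int sides) {
    assert(sides >= 1);
    // The engine supplies raw 32-bit values; the distribution object
    // maps them onto the requested range without modulo bias.
    std::uniform_int_distribution<int> dist(1, sides);
    return dist(engine);
}
```

A caller would typically create the engine once, e.g. std::mt19937 g(std::random_device{}());, and pass the same engine to every call rather than reseeding each time.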

In your code example, you use the std::mt19937 PRNG to feed std::shuffle. Since std::mt19937 is uniform, std::shuffle should also produce every permutation with equal probability. So everything is as uniform as it can be.
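Putting the pieces together, a rough C++ counterpart of rp = randperm(p) might look like the sketch below. The function name randperm and its signature are assumptions made for this example, and the values start at 1 to mirror MATLAB's 1-based result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper mirroring MATLAB's randperm(p):
// returns the integers 1..p in a uniformly random order.
std::vector<int> randperm(int p, std::mt19937& g) {
    assert(p >= 0);
    std::vector<int> v(static_cast<std::size_t>(p));
    std::iota(v.begin(), v.end(), 1);     // fill with 1, 2, ..., p
    std::shuffle(v.begin(), v.end(), g);  // each permutation equally likely
    return v;
}
```

With std::random_device rd; std::mt19937 g(rd()); as in the question, the call would be auto rp = randperm(p, g);. This should not be expected to reproduce the exact permutation MATLAB returns, even with the same seed, because the order in which std::shuffle consumes random numbers is left to the implementation.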

mindriot