1

The libsodium library has a function

uint32_t randombytes_uniform(const uint32_t upper_bound);

but obviously this returns an unsigned integer. Can I somehow use this to generate a uniformly distributed random double in range [-a,a] where a is also a double given by the user ? I am especially focused on the result being uniformly distributed/unbiased, so that is why I would like to use the libsodium library.

mlg556
  • 419
  • 3
  • 13
  • What's wrong with rand()? – Fabio A. Correa Apr 25 '19 at 13:57
  • I have concerns about the guarantee of uniform distribution of it, it is important that the result generated is as unbiased as possible. – mlg556 Apr 25 '19 at 14:03
  • If you search for “random double”, you will find plenty of other questions about this. But your question is ill-defined. Floating-point formats are finer near zero than away from it. If you picked a real number from a uniform distribution over an interval and rounded it to the nearest value representable in `double`, some values would occur more frequently than others, because they are farther from their neighbors, and the span of real numbers that round to them is longer than for `double` values near zero. – Eric Postpischil Apr 25 '19 at 14:07
  • 1
    Additionally, the endpoints will have half the frequency of their neighbors inside the interval. Sometimes when selecting “random” floating-point values, people want to use a population of values that are evenly spaced, ignoring the finer values. It depends on the purpose to which the numbers are being put. What are you using the random numbers for? – Eric Postpischil Apr 25 '19 at 14:09
  • @EricPostpischil I fail to see how a mathematically simple task such as "choosing a random real number uniformly distributed in a range" can be ill-defined. I am not well-versed in topics like digital precision and representation of real numbers so is this task impossible? Also, I guess we could settle on using a "population of values that are evenly spaced, ignoring the finer values" if this means I can choose the "grain-size" and the result will be let's say a multiple of `0.001` and uniformly distributed in a range [-1.245, 3.523] ? – mlg556 Apr 25 '19 at 17:41
  • @mlg556 even getting [0...1] number is VERY non-trivial. Please check my answer and link therein for details. – Severin Pappadeux Apr 25 '19 at 17:51
  • @chux yeah, if unsure about issues, read Taylor R Campbell. I would appreciate if you look at the answer I provided – Severin Pappadeux Apr 25 '19 at 18:37
  • @mlg556: .001, −1.245, and 3.523 cannot be represented in the format commonly used for `double`. So it is not possible to have an evenly spaced population of values with those parameters. For example, suppose you wanted points in [⅓, ⅔] spaced by .01 in a format that used two decimal digits. You could use the points .34, .35, .36,… .64, .65, .66. But then most of the points are spaced .01 apart, while .34 is spaced .06666… from the ⅓ endpoint, and .66 is similarly spaced. If the points were selected in uniform to the real numbers in the interval they are nearer than other points… – Eric Postpischil Apr 25 '19 at 21:29
  • … in the population, then .34 and .66 will occur with a different frequency than the other numbers. You cannot squeeze a uniform distribution into a format not commensurate with the interval parameters. Something has to give. This happens with your decimal numbers .001, −1.245, and 3.523 because the floating-point format uses binary. The nearest representable values are 0.001000000000000000020816681711721685132943093776702880859375, -1.24500000000000010658141036401502788066864013671875, and 3.523000000000000131450406115618534386157989501953125. – Eric Postpischil Apr 25 '19 at 21:32
  • Well, maybe you do not want a truly uniform distribution but just something that is good enough. (The deviations from uniform here could be made tiny, but, not too long ago, somebody pointed out a paper where surprising small deviations had a significant adverse effect in a security algorithm, one that could be exploited by attackers.) Even if you accept a non-uniform distribution, there are some issues to work out. How do you want to handle the endpoints? Not only can the non-uniformity be greater there, but look at those nearest representable values—they are strictly outside the interval. – Eric Postpischil Apr 25 '19 at 21:36
  • So would you want a solution that is closer to uniform but may return a point slightly outside the interval? In some algorithms, that could break the code, as it might cause an array access out of bounds. So maybe you want a solution that only returns points strictly inside the interval. This is one of the reasons I asked what you are using the random numbers for. Answering that could illuminate the actual needs and therefore clarify the problem. – Eric Postpischil Apr 25 '19 at 21:37

3 Answers3

1
const uint32_t mybound = 1000000000; // Example
const uint32_t x = randombytes_uniform(mybound);
const double a = 3.5; // Example
const double variate = a * ( (2.0 * x / mybound) - 1);
Fabio A. Correa
  • 1,968
  • 1
  • 17
  • 26
  • I would also like the bound to be a double, so for example the result is in the range [-3.5,+3.5], is that possible ? – mlg556 Apr 25 '19 at 13:58
  • Yes, I put a couple numbers as an example for you. Please consider to click to accept as an answer, thanks. – Fabio A. Correa Apr 25 '19 at 14:00
  • what is the significance of the value of the variable `mybound`, can I set it to UINT_MAX for example ? – mlg556 Apr 25 '19 at 14:02
  • 2
    This is bogus. To make 64bit double with 53bit mantissa you use only 32 random bits. How could you guarantee to be precisely within [-a...a] and with uniform coverage? – Severin Pappadeux Apr 25 '19 at 16:12
1

Let me try to to do it step-by-step.

First, you obviously need to combine two calls to get up to 64bit of randomness for one double value output.

Second, you convert it to [0...1] interval. There are several ways to do it, all of the are good in some sense or another, I prefer uniform random dyadic rationals in the form n*2-53, see here for details. You could try other methods listed above as well. NB: methods in the link produce results in [0...1) range, I've tried to do acceptance/rejection to get closed [0...1] range.

Last, I scale result into desired range.

Sorry, C++ only but it is trivial to convert to C

#include <stdint.h>
#include <math.h>
#include <iostream>
#include <random>

// emulate libsodium RNG, valid for full 32bits result only!
static uint32_t randombytes_uniform(const uint32_t upper_bound) {
    static std::mt19937 mt{9876713};
    return mt();
}

// get 64bits from two 32bit numbers
static inline uint64_t rng() {
    return (uint64_t)randombytes_uniform(UINT32_MAX) << 32 | randombytes_uniform(UINT32_MAX);
}

const  int32_t bits_in_mantissa = 53;
const uint64_t max  = (1ULL << bits_in_mantissa);
const uint64_t mask = (1ULL << (bits_in_mantissa+1)) - 1;

static double rnd(double a, double b) {
    uint64_t r;
    do {
        r = rng() & mask; // get 54 random bits, need 53 or max
    } while (r > max);
    double v = ldexp( (double)r, -bits_in_mantissa ); // http://xoshiro.di.unimi.it/random_real.c
    return a + (b-a)*v;
}

int main() {
    double a = -3.5;
    double b = 3.5;

    for(int k = 0; k != 100; ++k)
        std::cout << rnd(a, b) << '\n';

    return 0;
}
Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
1

First recognizing that finding a random number [0...a] is a sufficient step, followed by a coin flip for +/-.

Step 2. Find the expo such that a < 2**expo or ceil(log2(a)).

int sign;
do {
  int exp;
  frexp(a, &exp);

Step 3. Form an integral 63-bit random number [0...0x7FFF_FFFF_FFFF_FFFF] and random sign. The 63 should be at least as wide as the precision of a double - which is often 53 bits. At this point r is certainly uniform.

  unit64_t r = randombytes_uniform(0xFFFFFFFF);
  r <<= 32;
  r |= randombytes_uniform(0xFFFFFFFF);
  // peel off one bit for sign
  sign = r & 1;
  r >>= 1; 

Step 4. Scale and test if in range. Repeat as needed.

  double candidate = ldexp(r/pow(2 63), expo);
} while (candidate > a);

Step 5. Apply the sign.

if (sign) {
  candidate = -candidate;
}
return candidate;

Avoid (2.0 * x / a) - 1 as the calculation is not symmetric about 0.0.


Code would benefit with improvements to deal with a near DBL_MAX.

Some rounding issues apply that this answer glosses over, yet the distribution remains uniform - except potentially at the edges.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256