
Possible Duplicate:
how to get uniformed random between a, b by a known uniformed random function RANDOM(0,1)

In the book Introduction to Algorithms, there is an exercise:

Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The result of Random(a, b) should be exactly uniformly distributed, just like that of Random(0,1).

The results of the Random function are integers between a and b, inclusive. For example, Random(0,1) generates either 0 or 1; Random(a, b) generates one of a, a+1, a+2, ..., b.

My solution is like this:

for i = 1 to b-a
    r = a + Random(0,1)
return r

The running time is T = b-a.

Is this correct? Are the results of my solution uniformly distributed?

Thanks

What if my new solution is like this:

r = a
for i = 1 to b - a //including b-a
    r += Random(0,1)
return r

If it is not correct, why does r += Random(0,1) make r not uniformly distributed?

Jackson Tale
  • Your solution is not uniformly distributed. As an example, the lowest value `a` can only be produced when every call returns 0 (0+0+0+...), whereas a value in "the middle" is more probable because it can be produced as 0+0+0+1+1, and 0+0+1+0+1, and 1+1+0+0+0, and so on. Think of it like throwing two dice. The probability of getting 2 (1+1) or 12 (6+6) is lower than the probability of getting 7 (1+6, 2+5, 3+4, 4+3, 5+2, 6+1) (Settlers of Catan ftw. ;)). – Progman Jan 01 '12 at 11:23
  • Your second line resets `r` each time. You should initialize it to `a` and then update it in terms of itself in the loop. – Matthew Strawbridge Jan 01 '12 at 11:25

7 Answers

19

Others have explained why your solution doesn't work. Here's the correct solution:

1) Find the smallest number, p, such that 2^p > b-a.

2) Perform the following algorithm:

r=0
for i = 1 to p
    r = 2*r + Random(0,1)

3) If r is greater than b-a, go to step 2.

4) Your result is r+a
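
The four steps above can be sketched in Python. This is only an illustration under stated assumptions: `random01` is a hypothetical stand-in for the book's Random(0,1), and `int.bit_length()` is used to find p, since for a non-negative integer it gives exactly the smallest p with 2^p greater than that integer.

```python
import random

def random01():
    """Hypothetical stand-in for the book's Random(0,1): a fair 0/1 coin."""
    return random.getrandbits(1)

def random_ab(a, b, coin=random01):
    """Return an integer uniform on [a, b] using only calls to coin()."""
    span = b - a
    # Step 1: smallest p with 2**p > span.
    p = span.bit_length()
    while True:
        # Step 2: build a p-bit number from p coin flips.
        r = 0
        for _ in range(p):
            r = 2 * r + coin()
        # Step 3: accept only if r is in range, otherwise flip again.
        if r <= span:
            # Step 4: shift the result into [a, b].
            return a + r

print([random_ab(1, 3) for _ in range(10)])
```

Passing the coin as a parameter is a design choice made here so that a fixed bit sequence can be fed in for checking.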

So let's try Random(1,3).
Here b-a is 2.
2^1 = 2 is not greater than 2, so p has to be 2 (2^2 = 4 > 2).
So we'll loop two times. Let's try all possible outputs:

00 -> r=0, 0 is not > 2, so we output 0+1 = 1.
01 -> r=1, 1 is not > 2, so we output 1+1 = 2.
10 -> r=2, 2 is not > 2, so we output 2+1 = 3.
11 -> r=3, 3 is > 2, so we repeat.

So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
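
The exercise also asks for the expected running time. A rough derivation, assuming b > a and counting only calls to Random(0,1): each round uses p calls and succeeds with probability q, and because 2^(p-1) <= b-a, more than half of the 2^p bit patterns are accepted.

```latex
\[
q = \frac{b-a+1}{2^{p}} > \frac{1}{2},
\qquad
\mathbb{E}[\text{rounds}] = \frac{1}{q} < 2,
\qquad
\mathbb{E}[\text{calls}] = \frac{p}{q} = \frac{p\,2^{p}}{b-a+1} < 2p .
\]
```

Since p = floor(log_2(b-a)) + 1, the expected running time is O(log(b-a)). For Random(1,3) this gives q = 3/4 and an expected 2 * 4/3 = 8/3 calls.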

Note that if you have to do this a lot, two optimizations are handy:

1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.

2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
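
As a sketch of the first optimization, the range can be wrapped in a small class that computes p once. The class name `UniformRange` and its `coin` parameter are made up for this illustration. In Python, `int.bit_length()` does the job that a bit-scan instruction would do in lower-level code.

```python
import random

class UniformRange:
    """Hypothetical helper: caches p for a fixed range [a, b]."""

    def __init__(self, a, b, coin=None):
        self.a = a
        self.span = b - a
        # Computed once: smallest p with 2**p > span.
        self.p = self.span.bit_length()
        # Stand-in for Random(0,1) unless the caller supplies one.
        self.coin = coin or (lambda: random.getrandbits(1))

    def sample(self):
        while True:
            r = 0
            for _ in range(self.p):
                r = (r << 1) | self.coin()
            if r <= self.span:
                return self.a + r

die = UniformRange(1, 6)
print([die.sample() for _ in range(10)])
```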

David Schwartz
  • So the idea here is to find the maximum number of bits one would need for representing the highest number, and then just flipping the coin for each of those bits. – Christian Neverdal Jan 01 '12 at 12:11
  • Yes. And then if it's out of range, repeat the process. – David Schwartz Jan 01 '12 at 12:13
  • Came to something similar myself, but rather in the form of a binary search. Thanks, nice to see it was okay. Was hoping, though, for a clever way to reduce the problem from a power of 2 to any number (and a better big-O). – gorlum0 Feb 07 '12 at 11:29
2

No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.

Are you sure that Random(0,1) produces integers? It would make more sense if it produced floating point values between 0 and 1. Then the solution would be an affine transformation, with running time independent of a and b.

An idea I just had, in case it's about integer values: use bisection. At each step, you have a range from low to high. If Random(0,1) returns 0, the next range is from low to (low+high)/2, otherwise from (low+high)/2 to high. Details and complexity left to you, since it's homework.

That should create (approximately) a uniform distribution.

Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
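
To see how far off bisection can be, here is a small sketch that computes its exact output probabilities. The answer leaves the details open, so the split rule used here (0 keeps low..mid, 1 keeps mid+1..high, with mid rounded down) is an assumption.

```python
from fractions import Fraction

def bisection_distribution(a, b):
    """Exact output probabilities of coin-flip bisection on [a, b]."""
    probs = {}

    def split(lo, hi, weight):
        if lo == hi:
            probs[lo] = weight
            return
        mid = (lo + hi) // 2
        # Each coin outcome has probability 1/2.
        split(lo, mid, weight / 2)
        split(mid + 1, hi, weight / 2)

    split(a, b, Fraction(1))
    return probs

print(bisection_distribution(1, 3))  # 1 and 2 get 1/4 each, 3 gets 1/2
print(bisection_distribution(1, 4))  # every value gets 1/4
```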

Daniel Fischer
  • Here is a quote from the book: We shall assume that we have at our disposal a random-number generator RANDOM. A call to RANDOM(a, b) returns an integer between a and b, inclusive, with each such integer being equally likely. For example, RANDOM(0, 1) produces 0 with probability 1/2, and it produces 1 with probability 1/2 – Ivaylo Strandjev Jan 01 '12 at 11:27
  • Aha. Right, would have been too easy otherwise. I forgot the first line when I saw the homework tag :) – Daniel Fischer Jan 01 '12 at 11:31
  • @DanielFischer That will only produce an approximately uniform distribution. Consider (1,3). – David Schwartz Jan 01 '12 at 11:45
  • Yup, perfect only for powers of 2. But not too crappy for a spontaneous idea. – Daniel Fischer Jan 01 '12 at 11:47
1

No, your solution isn't correct. This sum will have a binomial distribution.

However, you can generate a pure random sequence of 0, 1 and treat it as a binary number.

steps = ceiling(log2(b - a + 1))   // number of bits needed to cover 0 .. b - a

repeat
  result = a
  for i = 0 to steps - 1
    result += (2 ^ i) * Random(0, 1)
until result <= b

KennyTM: my bad.

Arenielle
1

I read the other answers. For fun, here is another way to find the random number:

Allocate an array with b-a+1 elements (one per candidate value). Set all the values to 1. Iterate through the array. For each nonzero element, flip the coin, as it were. If it comes up 0, set the element to 0.

Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)

This would have O(infinity) in the worst case ... :) On average, though, half the remaining numbers would be eliminated in each pass, so it would take about log_2 (b-a) passes on average.
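
A minimal sketch of this elimination idea in Python. The answer leaves the tie case open, so the rule used here is an assumption: when a pass would eliminate every remaining candidate, the pass is thrown away and repeated. A list of surviving values stands in for the 0/1 array.

```python
import random

def random_ab_elimination(a, b, coin=lambda: random.getrandbits(1)):
    """Pick an integer from [a, b] by repeated coin-flip elimination."""
    alive = list(range(a, b + 1))
    while len(alive) > 1:
        # One pass: every remaining candidate survives on a 1.
        survivors = [x for x in alive if coin() == 1]
        # Assumed tie rule: if nobody survives, redo the pass.
        if survivors:
            alive = survivors
    return alive[0]

print([random_ab_elimination(1, 6) for _ in range(10)])
```

Every candidate is treated identically in each pass, so by symmetry each value is equally likely to be the last one standing.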

Christian Neverdal
  • You can actually make this work reasonably, and if you do it right, it's O(n log n) (where n=b-a). Basically, you want to increment each array element if Random(0,1) is 1, and you consider each element in the running if its value is less than a cutoff. You increment the cutoff after each pass, and if no values are valid, you decrement the cutoff. When one entry is valid, you're done. – David Schwartz Jan 01 '12 at 12:20
0

First of all, I assume you are actually accumulating the result, not adding 0 or 1 to a on each step. Using some probability theory you can prove that your solution is not uniformly distributed. The values closest to (a+b)/2 are the most likely. For instance, if a is 0 and b is 7, the chance that you get the value 4 is (7 choose 4) divided by 2 raised to the power 7, i.e. 35/128, whereas a uniform distribution would give each of the 8 values a chance of 1/8. The reason is that no matter which 4 of the 7 flips are 1, the result will still be 4.
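
The exact distribution of the accumulated sum can be tabulated with a few lines of Python. This is a sketch, assuming the loop makes exactly b-a fair coin flips. The function name is made up for the illustration.

```python
from fractions import Fraction
from math import comb

def sum_distribution(a, b):
    """Exact distribution of a + (sum of b-a fair coin flips)."""
    n = b - a
    # k ones among n flips can occur in comb(n, k) of the 2**n outcomes.
    return {a + k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}

for value, prob in sum_distribution(0, 7).items():
    print(value, prob, float(prob))
```

For a=0, b=7 the extremes 0 and 7 each get 1/128, while 3 and 4 each get 35/128.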

The running time you estimate is correct.

Ivaylo Strandjev
-1

Your solution's pseudocode should look like:

r=a
for i = 1 to b-a
    r+=Random(0,1)
return r

As for uniform distribution: assuming the underlying Random(0,1) is perfectly uniform, the odds of getting 0 or 1 are 50% each. The final number is therefore the result of that choice made over and over again.

So for a=1, b=5, there are 4 choices made.

Getting 1 requires all 4 choices to be 0; the odds of that are 0.5^4 = 6.25%.

Getting 5 requires all 4 choices to be 1; the odds of that are 0.5^4 = 6.25%.

As you can see from this, the distribution is not uniform: the odds of each number should be 20%.

-1

The algorithm you created does not produce an equal distribution.

The result "r" will always be either "a" or "a+1". It will never go beyond that.

To accumulate the result, it should look something like this:

r = a;
for i = 1 to b-a
   r = r + Random(0,1)

return r;

By including "r" in the computation, you carry the "randomness" of all the previous "for" loop iterations into the result.

Gopal Nair