2

Lets assume that I have a set of probabilities [0.1, 0.6, 0.2, 0.1]. I want to sample the locations from this set of probabilities. e.g. when I sample I should get the location 1 quite often than other locations. I know I can implement this in Matlab (using the command mnrnd) or other languages. However, I would like to know the algorithmic details. I would like to know a very simple algorithm which can be used to sample from a multinomial distribution.

user570593
  • 3,420
  • 12
  • 56
  • 91

1 Answers1

4

Brute force approach

Create an array containing the cumulative probabilities, in your case cdf = [0.1, 0.7, 0.9, 1.0]. Generate U, a uniform(0,1) random value. Select the first index such that cdf[i] <= U. For a small number of outcomes this could be accomplished with a linear search (O(n)), or use binary search (O(log n)) if the number of outcomes is large.

Alias method

An alias table requires you to use conditional probability to construct a table of primary elements and alias values in such a way that for each primary/alias pair, the total probability is identical. You then use one random number to choose a column within the table (with equal probability), and a second value to make a binomial choice between the primary and the alias. Run time is O(1) once the table has been constructed, which takes O(n) effort. See Wikipedia for details, or rubygems for a Ruby implementation. Note that this requires two uniforms per outcome, so it's not an inversion and you can't do fun tricks like generating antithetic variates.

pjs
  • 18,696
  • 4
  • 27
  • 56