Lets assume that I have a set of probabilities [0.1, 0.6, 0.2, 0.1]. I want to sample the locations from this set of probabilities. e.g. when I sample I should get the location 1 quite often than other locations. I know I can implement this in Matlab (using the command mnrnd) or other languages. However, I would like to know the algorithmic details. I would like to know a very simple algorithm which can be used to sample from a multinomial distribution.
-
1What value of `n` do you want to use? Not enough information given – Severin Pappadeux Jan 13 '20 at 22:39
1 Answers
Brute force approach
Create an array containing the cumulative probabilities, in your case cdf = [0.1, 0.7, 0.9, 1.0]
. Generate U
, a uniform(0,1) random value. Select the first index such that cdf[i] <= U
. For a small number of outcomes this could be accomplished with a linear search (O(n)), or use binary search (O(log n)) if the number of outcomes is large.
Alias method
An alias table requires you to use conditional probability to construct a table of primary elements and alias values in such a way that for each primary/alias pair, the total probability is identical. You then use one random number to choose a column within the table (with equal probability), and a second value to make a binomial choice between the primary and the alias. Run time is O(1) once the table has been constructed, which takes O(n) effort. See Wikipedia for details, or rubygems for a Ruby implementation. Note that this requires two uniforms per outcome, so it's not an inversion and you can't do fun tricks like generating antithetic variates.

- 18,696
- 4
- 27
- 56
-
1But OT asked for multinomial, not just some generic discrete finite probability distribution. Not sure if your answer could help... – Severin Pappadeux Jan 13 '20 at 22:40
-