0

If we are given a multinomial distribution

p=[0.2,0.4,0.1,0.3]

and we have to sample from this distribution over a number of times and return the result, how do I write the algorithm for this?

Eg - if I have a fair die and I want to roll it 20 time and get the total number of times that it landed on which side,

[4, 1, 7, 5, 2, 1]

this should be the result(randomized) -> It landed 4 times on 1, once on 2, etc. There is a function to do this in Numpy in numpy we can use

numpy.random.multinomial()
>>> np.random.multinomial(20, [1/6.]*6, size=1)
array([[4, 1, 7, 5, 2, 1]]) # random

I want to understand how the algorithm is written for performing this action I've tried this approach in python ->

import numpy as np
import random
probs = [0.2, 0.4, 0.1, 0.3]

def sample(count:int)->list:
    output = [0,0,0,0]
    for i in range(count):
        num = random.random()
        if(0 < num <= 0.15):
            output[2]+=1
        elif(0.15 < num <= 0.25):
            output[0]+=1
        elif(0.25 < num <= 0.35):
            output[3]+=1
        elif(0.35 < num <= 0.45):
            output[1]+=1
        
    return output

final_output = sample(10)
print(final_output)

np.random.multinomial(10, probs, size=1)

But I don't think this is the optimal way, maybe I'm lacking some concepts in Probability?

The actual Code written in Numpy in CPython: Link to the Numpy file where the code for numpy.random.multinomial() is written starting from line 4176

Possible Duplicate: How to sample from a multinomial distribution?

References: https://numpy.org/doc/stable/reference/random/generated/numpy.random.multinomial.html

Random number generation from multinomial distribution in R using rmultinom() function

scarecrow
  • 57
  • 2
  • 8
  • FYI: If you dig deeper into the numpy code, you eventually end up at the *actual* implementation (in C): https://github.com/numpy/numpy/blob/1995e2cdfd17ec75e37cc6e69a42ce9000aac966/numpy/random/src/distributions/distributions.c#L1672-L1689 – Warren Weckesser Feb 24 '22 at 16:40
  • Whoa, that's awesome, but it's just calling more of the binomial functions which in turn are calling other functions, I don't think I can make sense of that with all the arguments being passed around. Thanks though, I'll try to understand it! – scarecrow Feb 24 '22 at 17:10
  • That implementation of `multinomial` is iteratively using `binomial` to generate the components of the multinomial sample. – Warren Weckesser Feb 24 '22 at 17:12
  • I'm not able to understand clearly what they are trying to do in the `random_binomial_inversion` function. Seems like variable binomial of type `binomial_t *` has a bunch of attributes via which they are doing a bunch of calculations like `exp`, `sqrt`? Sorry if I'm too slow to catch up on – scarecrow Feb 24 '22 at 17:21

1 Answers1

0

If you care about sampling from this distribution multiple times, then it is worth looking at the Aliasing method - https://en.wikipedia.org/wiki/Alias_method#:~:text=In%20computing%2C%20the%20alias%20method,arbitrary%20probability%20distribution%20pi.

You can sample in $O(1)$ time after an initial computation of $O(K\log K)$ where $K$ is the size of the support of the distribution.

Black Jack 21
  • 315
  • 4
  • 19