I understand that, conceptually, they are different. But with a single trial (or experiment) per draw, does numpy.random.multinomial sample the same way as numpy.random.choice, just with a different view of the output?

For example:

>>> np.random.choice(6, size=6, replace=True, p=[1/6.]*6)
array([2, 0, 4, 2, 5, 4])

The output gives the identity of each item picked from the array [0, 1, 2, 3, 4, 5].

and

>>> np.random.multinomial(1, [1/6.]*6, size=6)
array([[0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 1, 0],
       [1, 0, 0, 0, 0, 0]])

The output gives the number of times each choice was picked, but since each row is limited to 1 trial, it can also be summarized as [2, 5, 3, 3, 4, 0] from choices [0, 1, 2, 3, 4, 5].

kentwait

2 Answers

Yes, they are effectively the same.
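
To make the correspondence concrete, here is a minimal sketch (not from the original answer) that converts one-trial multinomial rows back into indices and compares the empirical frequencies of both approaches. It assumes the newer `np.random.default_rng` Generator API rather than the legacy `np.random.*` functions used in the question, and the seed and sample size `n` are arbitrary choices for illustration.

```python
import numpy as np

# Assumption: Generator API; the seed only makes the sketch reproducible.
rng = np.random.default_rng(0)
p = np.full(6, 1 / 6)
n = 100_000  # arbitrary sample size for the comparison

# choice returns the picked category indices directly.
picked = rng.choice(6, size=n, replace=True, p=p)

# multinomial with one trial per row returns one-hot rows;
# argmax recovers the position of the single 1 in each row.
one_hot = rng.multinomial(1, p, size=n)
picked_from_one_hot = one_hot.argmax(axis=1)

# Empirical category frequencies for each method.
freq_choice = np.bincount(picked, minlength=6) / n
freq_multinomial = np.bincount(picked_from_one_hot, minlength=6) / n
print(freq_choice)
print(freq_multinomial)
```

Both frequency vectors should come out close to 1/6 per category. The individual draws will differ between the two calls, since each consumes its own random numbers.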

Robert Kern

Yes, but multinomial is faster, at least for a small case like this one:

$ python -m timeit 'import numpy as np' 'np.nonzero(np.random.multinomial(1, [1/6.]*6, size=6))[1]'
50000 loops, best of 5: 4.84 usec per loop
$ python -m timeit 'import numpy as np' 'np.random.choice(6, size=6, replace=True, p=[1/6.]*6)'
20000 loops, best of 5: 15.9 usec per loop
  • It depends on the size: for a very large probability distribution (>10k entries), `choice` was faster for me. – Manux Jan 30 '23 at 23:03
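
Since the result reportedly depends on the number of categories, one way to check on your own machine is a rough timing sketch like the following. The helper name `time_both`, the category counts, and the repetition count are all hypothetical choices for illustration; no timings are claimed here, and results will vary by NumPy version and hardware.

```python
import timeit

import numpy as np


def time_both(k, size=6, number=50):
    """Return mean seconds per call for (multinomial, choice) with k categories."""
    p = np.full(k, 1 / k)  # uniform distribution over k categories
    t_multinomial = timeit.timeit(
        lambda: np.nonzero(np.random.multinomial(1, p, size=size))[1],
        number=number,
    )
    t_choice = timeit.timeit(
        lambda: np.random.choice(k, size=size, replace=True, p=p),
        number=number,
    )
    return t_multinomial / number, t_choice / number


# Hypothetical category counts: small, medium, and beyond 10k entries.
for k in (6, 1_000, 20_000):
    m, c = time_both(k)
    print(f"k={k}: multinomial {m * 1e6:.1f} usec, choice {c * 1e6:.1f} usec")
```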