is there an more efficient way to enumerate probability for each of possible outcome of a discrete random variable in python or R?

Question

I am computing the pmf theoretically in Python. here is the code.

>>> a_coin = np.array([0,1])
>>> three_coins = np.array(np.meshgrid(a_coin,a_coin,a_coin)).T.reshape(-1,3)
>>> heads = np.sum(three_coins, axis = 1)
>>> df = pd.DataFrame({'heads': heads, 'prob': 1/8})
>>> np.array(df.groupby('heads').sum()['prob'])
array([0.125, 0.375, 0.375, 0.125])

this piece of code is simulating 1 toss of 3 fair coins. the possible outcomes is {0,1,2,3}. last line of code compute the probability for each of the possible outcomes respectively.

I have to put 10 'a_coin' in np.meshgrid(a_coin,...,a_coin) if i want to compute the pmf for tossing 10 fair coins, which seems to be boring and inefficient.

the question is, is there an more efficient way to do this in python or R?

Use math formulas? Binomial coeffs etc... – Julien Mar 13 '19 at 05:01 — Julien, Mar 13 '19 at 05:01

IRTFM · Accepted Answer · 2019-03-16T01:07:39.687

Here's how to do it in R:

> sapply(0:3, choose, n=3)/sum(sapply(0:3, choose, n=3))
[1] 0.125 0.375 0.375 0.125

The choose function gives you the binomial coefficients. To turn them into probabilities just divide by their summs:

sapply(0:10, choose, n=10)
 [1]   1  10  45 120 210 252 210 120  45  10   1

sapply(0:10, choose, n=10)/ sum( sapply(0:10, choose, n=10))
 [1] 0.0009765625 0.0097656250 0.0439453125 0.1171875000 0.2050781250 0.2460937500 0.2050781250
 [8] 0.1171875000 0.0439453125 0.0097656250 0.0009765625

It did not appear that you really wanted to enumerate so much as calculate. If you need to enumerate outcomes from 10 successive "fair" binomial draws, then you could use combn 11 times.

Paul Panzer · Answer 2 · 2019-03-13T08:12:33.127

Here is an fft based numpy solution:

import numpy as np
from scipy import fftpack

def toss(n=10, p=0.5):
    t1 = np.zeros(fftpack.next_fast_len(n+1))
    t1[:2] = 1-p, p
    f1 = fftpack.rfft(t1)
    c1 = f1[1:(len(t1) - 1) // 2 * 2 + 1].view(f'c{2*t1.itemsize}')
    c1 **= n
    f1[::(len(t1) + 1) // 2 * 2 - 1] **= n
    return fftpack.irfft(f1)[:n+1]

For example:

>>> toss(3)
array([0.125, 0.375, 0.375, 0.125])
>>> toss(10)
array([0.00097656, 0.00976562, 0.04394531, 0.1171875 , 0.20507813,
       0.24609375, 0.20507813, 0.1171875 , 0.04394531, 0.00976562,
       0.00097656])

score 0 · Answer 3 · answered Mar 13 '19 at 07:30

Using Python standard libraries you can get probabilities as rational numbers (this is exact solution), e.g.

from fractions import Fraction
from math import factorial

n=30
[Fraction(factorial(n), factorial(n - j)) * Fraction(1, factorial(j) * 2 ** n) for j in range(0, n + 1)]

This could be easily converted to floats, e.g.

list(map(float, [Fraction(factorial(n), factorial(n - j)) * Fraction(1, factorial(j) * 2 ** n) for j in range(0, n + 1)]))

is there an more efficient way to enumerate probability for each of possible outcome of a discrete random variable in python or R?

3 Answers3