3

I want to calculate the probability of the event that the sum of all eyes of n dice with s sides (numbered from 1 to s) is equal to t. My language is Python 3.

My current approach is pretty much a try-and-count solution and only works for small numbers (running probability(10, 10, 50) already ate all my RAM and forced me to hard reset):

import itertools
def probability(n, s, t):
    all_rolls=list(itertools.product(range(1,s+1), repeat=n))
    target_rolls=[l for l in all_rolls if sum(l)==t]
    return round(len(target_rolls)/len(all_rolls), 4)

But I honestly don't know how else to solve this. Can you please help me to get on the right track?

Byte Commander
  • 6,506
  • 6
  • 44
  • 71

3 Answers3

2

itertools.product is too slow for big number of sides > 5 and number of sides > 6. On my machine having dice_number: 10 and sides: 10 took one and a half hours to calculate. Instead I used numpy.polypow function to calculate targets and it took less than a second to calculate.

from numpy.polynomial.polynomial import polypow

def probability(dice_number, sides, target):
    """
    Using numpy polynomial
    The number of ways to obtain x as a sum of n s-sided dice
    is given by the coefficients of the polynomial:

    f(x) = (x + x^2 + ... + x^s)^n
    """

    # power series (note that the power series starts from x^1, therefore
    # the first coefficient is zero)
    powers = [0] + [1] * sides
    # f(x) polynomial, computed used polypow in numpy
    poly = polypow(powers, dice_number)
    return poly[target] / sides ** dice_number if target < len(poly) else 0
Vlad Bezden
  • 83,883
  • 25
  • 248
  • 179
1

first off: the total possible roll combinations will always be s**n, so you don't need to store a list of all possibilities in order to get it's length. Similarly you can just keep a running total of desired outcomes instead of keeping a list of them to save on memory space but it still won't speed up the function a whole lot:

def probability(n, s, t):
    all_rolls = itertools.product(range(1,s+1), repeat=n) #no list, leave it a generator
    target_rolls=sum(1 for l in all_rolls if sum(l)==t) #just total them up
    return round(target_rolls/s**n, 4)

A much more efficient way of calculating the possibilities is with a dict and some clever iteration. Each dictionary will use the roll value as keys and frequency as value, each iteration the prev will be this dict for the previous X dice and cur will be updated from it with the addition of another die:

import collections
def probability(n, s, t):
    prev = {0:1} #previous roll is 0 for first time
    for _ in range(n):
        cur = collections.defaultdict(int) #current probability
        for r,times in prev.items():
            for i in range(1,s+1):
                #if r occured `times` times in the last iteration then
                #r+i have `times` more possibilities for the current iteration.
                cur[r+i]+=times
        prev = cur #use this for the next iteration

    return cur[t] / s**n
    #return round(cur[t] / s**n , 4)

note 1: since cur is a defaultdict trying to look up a number that is not not possible with the given input will return 0

note 2: since this method puts together a dictionary with all possible outcomes you can return cur and do the calculation for multiple different possible outcomes on the same dice rolls.

Tadhg McDonald-Jensen
  • 20,699
  • 5
  • 35
  • 59
1

Stop making lists. Just use lazy evaluation.

from itertools import product

def prob(dice, pips, target):
    rolls = product(range(1, pips+1), repeat=dice)
    targets = sum(1 for roll in rolls if sum(roll) == target)
    return targets / pips**dice

Tests:

for i in range(5, 26):
    print(i, prob(5, 5, i))
print('sum: ', sum(prob(5, 5, i) for i in range(5, 26)))
# prints
5 0.00032
6 0.0016
7 0.0048
8 0.0112
9 0.0224
10 0.03872
11 0.0592
12 0.0816
13 0.1024
14 0.1168
15 0.12192  # symmetrical, with peak in middle
16 0.1168
17 0.1024
18 0.0816
19 0.0592
20 0.03872
21 0.0224
22 0.0112
23 0.0048
24 0.0016
25 0.00032
sum:  1.0000000000000002

edit: removed unused def

Terry Jan Reedy
  • 18,414
  • 3
  • 40
  • 52
  • you don't use `tsum` inner function? – Tadhg McDonald-Jensen Mar 21 '16 at 16:37
  • 1
    Whoops. I was going to, but decided `if` in generator expression would be faster. Removed. – Terry Jan Reedy Mar 21 '16 at 16:41
  • 1
    prob(8, 8, 40) took about 3 seconds. I leave running prob(10, 10, 50) to you. Probabilities could be calculated much faster by formulas rather than brute-force enumeration, but I presume that the answer is not the point of the exercise. – Terry Jan Reedy Mar 21 '16 at 16:43
  • Your improvement is nice as I can now run the function for big numbers without crashing my machine because of a memory overflow, but it still takes forever to calculate big numbers. Therefore I'm accepting the other answer. – Byte Commander Mar 22 '16 at 07:56