Your process can be described as a Markov chain with a finite number of states, and an absorbing state.
The number of steps before reaching the absorbing state is called the hitting time. The expected hitting time can be calculated easily from the transition matrix of the Markov chain.
Enumerate all possible states (a, b, c, d, e, f). Consider only a finite number of states, because "b >= 3" is effectively the same as "b = 3", etc. The total number of states is (1+1)*(3+1)*(0+1)*(2+1)*(3+1) = 192
.
Make sure that in your enumeration, starting state (0, 0, 0, 0, 0, 0) comes first, with index 0, and absorbing state (1, 3, 0, 1, 2, 3) comes last.
Build the transition matrix P
. It's a square matrix with one row and column per state. Entry P[i, j]
in the matrix gives the probability of going from state i
to state j
when rolling a die. There should be at most 6 non-zero entries per row.
For example, if i
is the index of state (1, 0, 0, 1, 2, 2)
and j
the index of state (1, 1, 0, 1, 2, 2)
, then P[i, j]
= probability of rolling face B = 0.05. Another example: if i
is the index of state (1,3,0,0,0,0)
, then P[i,i]
= probability of rolling A, B or C = 0.25+0.05+0.2 = 0.5.
Call Q
the square matrix obtained by removing the last row and last column of P
.
Call I
the identity matrix of the same dimensions as Q
.
Compute matrix M = (I - Q)^-1
, where ^-1
is matrix inversion.
In matrix M
, the entry M[i, j]
is the expected number of times that state j
will be reached before the absorbing state, when starting from state i
.
Since our experiment starts in state 0, we're particularly interested in row 0 of matrix M.
The sum of row 0 of matrix M is the expected total number of states reached before the absorbing state. That is exactly the answer we seek: the number of steps to reach the absorbing state.
To understand why this works, you should read a course on Markov chains! Perhaps this one: James Norris' course notes on Markov chains. The chapter about "hitting times" (which is the name for the number of steps before reaching target state) is chapter 1.3.
Below, an implementation in python.
from itertools import product, accumulate
from operator import mul
from math import prod
import numpy as np
dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]
def get_expected_n_trials(targets, dice_weights):
states = list(product(*(range(n+1) for n in targets)))
base = list(accumulate([n+1 for n in targets[:0:-1]], mul, initial=1))[::-1]
lookup = dict(map(reversed, enumerate(states)))
P = np.zeros((len(states), len(states)))
for i, s in enumerate(states):
a,b,c,d,e,f = s
for f, p in enumerate(dice_weights):
#j = index of state reached from state i when rolling face f
j = i + base[f] * (s[f] < targets[f])
j1 = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]
if (j != j1):
print(i, s, f, ' --> ' , j, j1)
assert(j == j1)
P[i,j] += p
Q = P[:-1, :-1]
I = np.identity(len(states)-1)
M = np.linalg.inv(I - Q)
return M[0,:].sum()
print(get_expected_n_trials(targets, dice_weights))
# 61.28361802372382
Explanations of code:
- First we build the list of states using Cartesian product
itertools.product
- For a given state
i
and die face f
, we need to calculate j
= state reached from i when adding f. I have two ways of calculating that, either as j = i + base[f] * (s[f] < targets[f])
or as j = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]
. Because I'm paranoid, I calculated it both ways and checked that the two ways gave the same result. But you only need one way. You can remove lines j1 = ...
to assert(j == j1)
if you want.
- Matrix
P
begins filled with zeroes, and we fill up to six cells per row with P[i, j] += p
where p
is probability of rolling face f
.
- Then we compute matrices
Q
and M
as I indicated above.
- We return the sum of all the cells on the first row of
M
.
To help you better understand what is going on, I encourage you to examine the values of all variables. For instance you could replace return M[0, :].sum()
with return states, base, lookup, P, Q, I, M
and then write states, base, lookup, P, Q, I, M = get_expected_n_trials(targets, dice_weights)
in the python interactive shell, so that you can look at the variables individually.