0

I'm having a bit of an issue with transitions within a Markov chain when the conditional probabilities that describe the transitions have more than one significant digit. For example, if the conditional probabilities are:

eps_a <- 0.3 #    Pr(a leaves)
eps_b <- 0.4 #    Pr(b leaves)
gam_c <- 0.2 #    Pr(c arrives)
eps_b_a <- 0.4 #  Pr(b leaves | a)
gam_c_a <- 0.9 #  Pr(c arrives | a)
eps_a_b <- 0.2 #  Pr(a leaves | b)
gam_c_b <- 0.9 #  Pr(c arrives | b)
gam_c_ab <- 0.7 # Pr(c arrives | ab)

And the transitions between 8 different states are:

tm <- rep(0,8)
# Assuming AB was state in last time step
tm[1] <- eps_a * eps_b * (1 - gam_c) #----------------------|U
tm[2] <- (1 - eps_a) * eps_b_a * (1 - gam_c_a) #------------|A
tm[3] <- eps_a_b * (1 - eps_b) * (1 - gam_c_b) #------------|B
tm[4] <- eps_a * eps_b * gam_c #----------------------------|C
tm[5] <- (1 - eps_a_b) * (1 - eps_b_a) * (1 - gam_c_ab) #---|AB
tm[6] <- (1 - eps_a) * eps_b_a * gam_c_a #------------------|AC
tm[7] <- eps_a_b * (1 - eps_b) * gam_c_b #------------------|BC
tm[8] <- (1 - eps_a_b) * (1 - eps_b_a) * gam_c_ab #---------|ABC

Then tm sums to 1.

sum(tm)
[1] 1

However, if the conditional probabilities have more significant digits (even 1 more), then the transition probabilities no longer sum to 1.

eps_a <-     0.3216542
eps_b <-     0.4123442
gam_c <-     0.2145621
eps_b_a <-   0.4231564
gam_c_a <-   0.9943285
eps_a_b <-   0.2321542
gam_c_b <-   0.9978964
gam_c_ab <-  0.7777662

When using these conditional probabilities the transitions in tm now sum to 0.9990323. Using other numbers I've seen it range between 0.97 and 1.08. I thought at that it could be an underflow issue, but summing the logs of these probabilities does not fix this either.

# Using the probabilities with more digits above.
tm_log <- rep(0,8)
tm_log[1] <- log(eps_a) + log(eps_b) + log((1 - gam_c)) #------------------|U
tm_log[2] <- log(1 - eps_a) + log(eps_b_a) + log(1 - gam_c_a) #------------|A
tm_log[3] <- log(eps_a_b) + log(1 - eps_b) + log(1 - gam_c_b) #------------|B
tm_log[4] <- log(eps_a) + log(eps_b) + log(gam_c) #------------------------|C
tm_log[5] <- log(1 - eps_a_b) + log(1 - eps_b_a) + log(1 - gam_c_ab) #-----|AB
tm_log[6] <- log(1 - eps_a) + log(eps_b_a) + log(gam_c_a) #----------------|AC
tm_log[7] <- log(eps_a_b) + log(1 - eps_b) + log(gam_c_b) #----------------|BC
tm_log[8] <- log(1 - eps_a_b) + log(1 - eps_b_a) + log(gam_c_ab) #---------|ABC
sum(exp(tm_log))
[1] 0.9990323

This leads me to believe that I have either coded up the transition probabilities incorrectly or there is some other issue that I do not know about. I could always calculate one of the transitions as 1 minus the sum of all the others, is that the only way around this?

mfidino
  • 3,030
  • 1
  • 9
  • 13
  • You have an error somewhere in the equations. The sum is equal to 1 only when `eps_b_a == eps_b` or `eps_a_b == eps_a`. – Marat Talipov May 04 '18 at 17:33
  • Interesting! I think the issue may be that I am using the conditional probabilities if the object from the previous time-step continues to be present in the current time-step. For example, AB to B does not condition on object B being present while it does condition on A. Instead, I should (in this example) always condition on AB (i.e., only use `eps_a_b`, `eps_b_a`, and `gam_c_ab`). – mfidino May 04 '18 at 18:16

0 Answers0