3

I want to numerically evaluate the transition probability of a linear Birth and Death process

where is the binomial coefficient and

I am able to evaluate it with an acceptable numerical error (using logarithms and the Kahan-Neumaier summation algorithm) for most parameter combinations.

Problems arise when addends alternate in sign and numerical error dominates the sum (condition number tends to infinity in this case). This happens when

For example, I have problems evaluating p(1000, 2158, 72.78045, 0.02, 0.01). It should be 0 but I get the very big value log(p) ≈ 99.05811, which is impossible for a probability.

I tried refactoring the sum in many different ways and using various "precise" summation algorithms such as Zhu-Hayes. I always get approximately the same wrong value, making me think that the problem is not the way I sum the numbers but the inner representation of each addend.

Because of binomial coefficients, values easily overflow. I tried with a linear transformation in order to keep each (absolute) element in the sum between the lowest normal number and 1. It didn't help and I think it is because of many algebraic operations of similar magnitudes.

I am now at a dead end and don't know how to proceed. I could use arbitrary precision arithmentic libraries but the computational cost is too high for my Markov Chain Monte Carlo application.

Is there a proper way or tricks to evaluate such sums when we cannot store partial sums at a good-enough precision in a IEEE-754 double?

Here is a basic working example where I only rescale the values by the maximum and sum with Kahan summation algorithm. Obviously, most values end up being subnormals with a Float64.

# this is the logarithm of the absolute value of element h
@inline function log_addend(a, b, h, lα, lβ, lγ)
  log(a) + lgamma(a + b - h) - lgamma(h + 1) - lgamma(a - h + 1) -
  lgamma(b - h + 1) + (a - h) * lα + (b - h) * lβ + h * lγ
end

# this is the logarithm of the ratio between (absolute) elements i and j
@inline function log_ratio(a, b, i, j, q)
  lgamma(j + 1) + lgamma(a - j + 1) + lgamma(b - j + 1) + lgamma(a + b - i) -
  lgamma(i + 1) - lgamma(a - i + 1) - lgamma(b - i + 1) - lgamma(a + b - j) +
  (j - i) * q
end

# function designed to handle the case of an alternating series with λ > μ > 0
function log_trans_prob(a, b, t, λ, μ)
  n = a + b
  k = min(a, b)

  ω = μ / λ
  η = exp((μ - λ) * t)

  if b > zero(b)
    lβ = log1p(-η) - log1p(-ω * η)
    lα = log(μ) + lβ - log(λ)
    lγ = log(ω - η) - log1p(-ω * η)
    q = lα + lβ - lγ

    # find the index of the maximum addend in the sum
    # use a numerically stable method for solving quadratic equations
    x = exp(q)
    y = 2 * x / (1 + x) - n
    z = ((b - x) * a - x * b) / (1 + x)

    sup = if y < zero(y)
      ceil(typeof(a), 2 * z / (-y + sqrt(y^2 - 4 * z)))
    else
      ceil(typeof(a), (-y - sqrt(y^2 - 4 * z)) / 2)
    end

    # Kahan summation algorithm
    val = zero(t)
    tot = zero(t)
    err = zero(t)
    res = zero(t)
    for h in 0:k
      # the problem happens here when we call the `exp` function
      # My guess is that log_ratio is either very big or very small and its
      # `exp` cannot be properly represented by Float64
      val = (-one(t))^h * exp(log_ratio(a, b, h, sup, q))
      tot = res + val
      # Neumaier modification
      err += (abs(res) >= abs(val)) ? ((res - tot) + val) : ((val - tot) + res)
      res = tot
    end

    res += err

    if res < zero(res)
      # sum cannot be negative (they are probabilities), it might be because of
      # rounding errors
      res = zero(res)
    end

    log_addend(a, b, sup, lα, lβ, lγ) + log(res)
  else
    a * (log(μ) + log1p(-η) - log(λ) - log1p(-ω * η))
  end
end

# ≈ 99.05810564477483 => impossible
log_trans_prob(1000, 2158, 72.78045, 0.02, 0.01)

# increasing precision helps but it is too slow for applications
log_trans_prob(BigInt(1000), BigInt(2158), BigFloat(72.78045), BigFloat(0.02),
               BigFloat(0.01))
Tasos Papastylianou
  • 21,371
  • 2
  • 28
  • 57
  • I'm pretty sure your use of `log1p()` is misguided. If you want to avoid loss of precision due to the lack of a fixed point of `log()` and `exp()` at zero, you're too late, since `exp((μ - λ) * t)` is already bleeding bits when the argument is near zero. Use `expm1((μ - λ) * t)` instead. – EOF Aug 09 '18 at 17:32
  • I don't know if it is relevant but https://www.math.upenn.edu/~wilf/AeqB.html may be of interest. – dmuir Aug 13 '18 at 11:32
  • @dmuir Thank you very much! I didn't know about the book. It will need me some time to digest it but it is definitely relevant. If not, it is still a very interesting book. Thanks again. – Alberto Pessia Aug 15 '18 at 10:53
  • @dmuir I recently published an arXiv paper ( https://arxiv.org/abs/1909.10765 ) answering this question. The manuscript wouldn't exist if you didn't point me to "A = B" book. I would like to thank you in the acknowledgments. If you agree, send me an email (you can find it in my manuscript) with your real name :) – Alberto Pessia Sep 30 '19 at 08:29

1 Answers1

1

I finally solved the problem and wrote a paper describing the solution in detail: https://arxiv.org/abs/1909.10765

Briefly, divide and multiply each addend by the first term in the sum to obtain

p(a, b, t, λ, μ) = ω(a, b, t, λ, μ) 2F1(-a, -b; -(a + b - 1); -z(t, λ, μ))

where ω(a, b, t, λ, μ) is the first term in the series and 2F1 is the Gaussian hypergeometric function. Hypergeometric function 2F1(-a, -b; -(a + b - k); -z) (a and b positive integers, k <= 1, z real number) can be computed with the following three-term recurrence relation (TTRR):

u(a, b, k) y(b + 1) + v(a, b, k, z) y(b) + w(b, k, z) y(b - 1) = 0

where

u(a, b, k) = (a + b + 1 − k) (a + b − k)

v(a, b, k, z) = − (a + b − k) (a + b + 1 − k + (a − b) z)

w(b, k, z) = − b (b − k) z

If b > a swap the two variables (that is a' = max(a, b) and b' = min(a, b)).

Compute the recurrence in a forward manner starting with values y(0) = 2F1(-a, 0; -(a - k); -z) = 1 and y(1) = 2F1(-a, -1; -(a + 1 - k); -z) = 1 + (a z) / (a + 1 - k).

I implemented the previous algorithm in the Julia package SimpleBirthDeathProcess.