Preventing underflows when computing log of the probability that a normal sample falls in a certain interval in python

Question

Given two reals x and y, I want to compute the following function in python:

log Pr [ x <= t <= y ],

where t is sampled from a normal distribution.

One naive implementation is to use scipy.stats.norm.

np.log(scipy.stats.norm.cdf(y) - scipy.stats.norm.cdf(x))

Unfortunately, this causes an underflow when x and y are far from 0. How can one prevent such a numerical error?

Simon Gibbons · Answer 1 · 2016-05-17T10:31:37.470

This problem is much more stable if done in logspace.

The trick is to use scipy.stats.norm.logcdf for values less than zero and scipy.stats.norm.logsf for values larger than zero.

This combined with a stable algorithm for calculating log(exp(y) - exp(x)) gives reasonable results

import numpy as np
from scipy.stats import norm

def log_subtract(x, y):
    return x + np.log1p(-np.exp(y-x))

def lnprob(x, y):
    if x < 0:
        return log_subtract(norm.logcdf(y), norm.logcdf(x))
    else:
        return log_subtract(norm.logsf(x), norm.logsf(y))

Preventing underflows when computing log of the probability that a normal sample falls in a certain interval in python

1 Answers1