Python: Do pandas.DataFrame.cumprod() and numpy.cumprod() handle numerical underflow?

Question

Specifically, are these cumulative product functions in pandas and numpy implemented in a robust way to handle underflow when multiplying lots of small numbers together? For example, are they using the log-sum-exp trick?

Thanks.

You can check this pretty easily. For example, set `x = np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150])`, and compare `np.cumprod(x)` with `np.exp(np.cumsum(np.log(x)))`. — Warren Weckesser, Apr 20 '17 at 01:08
Yes, I did something similar, but I wasn't sure where the theoretical bounds are, or whether it was just hitting the limit of my platform (machine/OS/etc.). — WillZ, Apr 21 '17 at 10:31

score 1 · Accepted Answer · edited Apr 21 '17 at 12:24

Unfortunately, no. The example from @warren-weckesser's comment shows that underflow is not handled:

import numpy as np

np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150]).cumprod()

# returns
array([1.0e-005, 1.0e-035, 1.0e-135, 0.0e+000, 0.0e+000, 0.0e+000])

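The reason is that numpy's default float type (float64) has a smallest positive normal value of 2**-1022, or about 2.225e-308. Subnormal numbers extend the range down to about 4.9e-324 at reduced precision, but any result smaller than that is rounded to zero. That is what happens at the fourth element above, where 1e-135 * 1e-200 = 1e-335. Once the running product reaches zero it stays zero, even when the later factors are large. The same is true for pandas.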
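If you need the running product to survive the underflow, one workaround is the comparison suggested in the comments: accumulate in log space with `np.cumsum(np.log(x))`. The sketch below is only an illustration, not something numpy or pandas does internally, and it assumes every element is strictly positive (zeros or negative values would need separate sign handling). The variable names are mine.

```python
import numpy as np

x = np.array([1e-5, 1e-30, 1e-100, 1e-200, 1e50, 1e150])

# Direct cumulative product: hits exactly 0.0 once the running product
# falls below the smallest subnormal float64, and never recovers.
direct = np.cumprod(x)

# Running sum of logs: the same quantity in log space, where the values
# stay far from the float64 limits. Assumes every element is > 0.
log_running = np.cumsum(np.log(x))

# Convert to base-10 exponents for readability.
log10_running = log_running / np.log(10.0)

# Exponentiating recovers the entries that are representable; entries that
# are genuinely below the float64 range still come back as 0.0.
recovered = np.exp(log_running)

print(direct)
print(log10_running)
print(recovered)
```

The log-space values keep tracking the true product through the point where `direct` collapses to zero, so the last two entries can be recovered after exponentiating. If the final quantities themselves are too small for float64, keep working with the logs instead of converting back.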
