
Question motivation:

In the standard numerical languages I am aware of (e.g. MATLAB, Python's numpy, etc.), if you take the exponential of a modestly large number, the output is infinity as the result of numerical overflow. If this is then multiplied by 0, you get NaN. Taken separately, each of these steps is reasonable enough, but together they reveal a logical mistake in the implementation of the math: the number produced by the overflow is known to be finite, and we clearly want the result of multiplying this large finite number by 0 to be 0.

Explicitly:

>>> import numpy as np
>>> np.exp(709)*0
0.0
>>> np.exp(710)*0
nan
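
Breaking the failing case into its two steps shows where the nan comes from: the overflow itself just produces inf, and it is the subsequent multiplication of inf by 0 that is undefined:

>>> np.exp(710)
inf
>>> np.inf * 0
nan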

I imagine we could introduce a notion of a "largest finite value" (LFV) with the following properties (a toy sketch follows the list):

  • LFV would be the default for numerical overflow that would otherwise round up to infinity
  • LFV < infinity
  • any explicit number < LFV (for example, if LEV stands for "largest explicit value", then LEV < LFV)
  • (MATLAB detail: realmax < LFV)
  • LFV*0 = 0
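
To make these properties concrete, here is a minimal toy sketch in Python. This is purely hypothetical; no language or library actually provides this, and the LFV class and its semantics are mine, just to illustrate what I mean:

import math

class LFV:
    # Toy stand-in for the proposed "largest finite value". Only the
    # properties listed above are modeled; everything else is undefined.
    def __mul__(self, other):
        if other == 0:
            return 0.0  # the key property: LFV * 0 = 0
        return math.inf if other > 0 else -math.inf  # all we can say otherwise
    __rmul__ = __mul__
    def __lt__(self, other):
        return other == math.inf  # LFV < infinity, and nothing else
    def __gt__(self, other):
        # any explicit (representable) number sits below LFV
        return not isinstance(other, LFV) and other != math.inf
    def __repr__(self):
        return 'LFV'

lfv = LFV()
print(lfv * 0)         # 0.0
print(lfv < math.inf)  # True
print(1e308 < lfv)     # True (realmax-style values are below LFV)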

On the other hand, infinity should not simply be redefined in the way described for LFV. It does not make sense for 0 * infinity to equal 0; appropriately, the current standard implementation of infinity yields NaN in this setting. Also, there sometimes arises a need to initialize numbers to infinity, and you'd want the result of any numerical operation, even one that yielded LFV, to be strictly less than the initialized value (this is convenient for some logical statements). I'm sure other situations exist where a proper infinity is necessary -- my point is simply that infinity should not be redefined to have some of the LFV properties above.

The Question:

I want to know if there is any language which uses a scheme like this and whether there are any problems with such a scheme. This problem does not arise in proper math, since there are no numerical limits on the size of numbers, but I think it is a real problem when implementing a consistent mathematics in programming languages. Essentially, I think what I want from LFV is a shorthand for the open interval between the largest explicit value and infinity, LFV = (LEV, infinity), but maybe this intuition is wrong.

Update: In the comments, people seem to be objecting a little to the utility of the issue I'm bringing up. My question arises not because many related issues occur, but rather because the same issue frequently arises in many different settings. From talking to people who do data analysis, this is something that contributes often enough to runtime errors when training/fitting models. The question is basically why this isn't handled by numerical languages. From the comments, I essentially gather that the people who write the languages don't see the utility of handling things this way. In my opinion, when certain specific issues occur frequently enough for people using a language, it might make sense to handle those exceptions in a principled way so each user doesn't have to.

Josh
    It doesn't seem to solve much. You still have problems like `LFV - LFV == NaN`, and if you put this in, you know people are going to want an SPV (smallest positive value), and then `SPV * LFV == NaN` again. – user2357112 Apr 25 '15 at 00:21
  • @user2357112 I think you make a good observation. Though wouldn't this still provide useful information in some regimes (i.e. handle things that are currently handled poorly in a better way)? Admittedly, this wouldn't solve all of the problems related to bounded value precision. I don't think I'd object to also having a SPV. – Josh Apr 25 '15 at 00:29
  • If you're just asking whether any of the major numerical languages/libraries work this way, I think you already know the answer: no. If you're asking whether there's some obscure niche library that does so, there might be, but who cares? I think what you really want to know is why they don't (and what's wrong with your argument that they should), which could probably be better answered with a discussion on the NumPy list than a StackOverflow question that asks something tangential. – abarnert Apr 25 '15 at 01:10
  • How would you decide what the value of LFV is? – 1.618 Apr 25 '15 at 01:13
  • A few observations: you can get the same benefits and a whole lot more by doing the math symbolically as far as possible before switching to numerics, or by using an arbitrary-size type instead of an IEEE double. Also, you could easily build a Python class that acts like float but handles your LFV, then work through some use cases, after which you'll likely either understand why it's not a good idea or be able to explain why it is in a proposal to extend NumPy. – abarnert Apr 25 '15 at 01:13
  • One more thing: LFV isn't really the largest finite value; it's _all_ finite values larger than the penultimate finite value, not just the first. You could _call_ that a value, but that means that 10**500 == 10**5000, and now your arithmetic is far more broken, as far as consistency goes, than IEEE arithmetic, so I doubt you want that. – abarnert Apr 25 '15 at 01:16
  • `LFV - 1` would also be undefined, because LFV could refer to any value larger than the largest representable explicit value but smaller than infinity. In fact, pretty much anything involving LFV apart from `LFV * 0` would be undefined. I can't really see much benefit over just using infinity. – ali_m Apr 25 '15 at 01:44
  • In a way, LFV actually exists in IEEE 754, in the form of infinity with the [overflow flag](https://en.wikipedia.org/wiki/IEEE_floating_point#Exception_handling) set. The reason it's not a separate value is probably its limited usefulness. If you need to have the multiply-by-zero case handled properly, you'll just have to manually check the flag after the operation that could potentially overflow. You can get numpy to raise an exception or call a function on overflow with [`seterr`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html). – dddsnn Apr 25 '15 at 02:23
  • @dddsnn (+1) This is a really useful comment. Much appreciated. If you had made that an answer, it would currently be in the running for being accepted. – Josh Apr 25 '15 at 02:30
  • It might be fun to point out that [Symmetric Level Index arithmetic](http://en.wikipedia.org/wiki/Symmetric_level-index_arithmetic) has the wonderful property of being *immune* to underflow and overflow - except for division by 0 (which is always exactly 0 due to the lack of underflow). Unfortunately it's not exactly well-known. – Veedrac Apr 26 '15 at 10:40

1 Answer


So... I got curious and dug around a little.

As I already mentioned in the comments, a "largest finite value" kind of exists in IEEE 754, if you consider the exception status flags. A value of infinity with the overflow flag set corresponds to your proposed LFV, with the difference that the flag is only available to be read out after the operation, instead of being stored as part of the value itself. This means you have to check the flag manually and act if overflow occurs, instead of just having LFV*0 = 0 built in.
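
In numpy terms, "check the flag and act" boils down to something like the following sketch, which uses np.errstate to surface the overflow flag as an exception; np.finfo(np.float64).max plays the role of the proposed LFV here:

import numpy as np

# Sketch: emulate "infinity with the overflow flag set" by raising on
# overflow, then substituting the largest representable double by hand.
with np.errstate(over='raise'):
    try:
        x = np.exp(np.float_(710))
    except FloatingPointError:         # the overflow flag, surfaced as an exception
        x = np.finfo(np.float64).max   # stand-in for the proposed LFV

print(x * 0)   # 0.0 rather than nan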

There is quite an interesting paper on exception handling and its support in programming languages. Quote:

The IEEE 754 model of setting a flag and returning an infinity or a quiet NaN assumes that the user tests the status frequently (or at least appropriately.) Diagnosing what the original problem was requires the user to check all results for exceptional values, which in turn assumes that they are percolated through all operations, so that erroneous data can be flagged. Given these assumptions, everything should work, but unfortunately they are not very realistic.

The paper also bemoans the poor support for floating point exception handling, especially in C99 and Java (I'm sure most other languages aren't better, though). The fact that, in spite of this, there are no major efforts to fix the situation or create a better standard seems to me to indicate that IEEE 754 and its support are, in a sense, "good enough" (more on that later).


Let me give a solution to your example problem to demonstrate something. I'm using numpy's seterr to make it raise an exception on overflow:

import numpy as np

def exp_then_mult_naive(a, b):
    # Ignore all floating point errors: the overflow silently becomes inf,
    # and inf * 0 then becomes nan.
    err = np.seterr(all='ignore')
    x = np.exp(a) * b
    np.seterr(**err)
    return x

def exp_then_mult_check_zero(a, b):
    # Raise on overflow so the b == 0 case can be short-circuited to 0.
    err = np.seterr(all='ignore', over='raise')
    try:
        x = np.exp(a)
        return x * b
    except FloatingPointError:
        if b == 0:
            return 0
        else:
            return exp_then_mult_naive(a, b)
    finally:
        np.seterr(**err)

def exp_then_mult_scaling(a, b):
    # Shift factors of e from e**a onto b: e**a * b == e**(a-1) * (b*e),
    # repeated until e**a no longer overflows or abs(b) reaches 1.
    err = np.seterr(all='ignore', over='raise')
    e = np.exp(1)
    while abs(b) < 1:
        try:
            x = np.exp(a) * b
            break
        except FloatingPointError:
            a -= 1
            b *= e
    else:
        x = exp_then_mult_naive(a, b)
    np.seterr(**err)
    return x

large = np.float_(710)
tiny = np.float_(0.01)
zero = np.float_(0.0)

print('naive: e**710 * 0 = {}'.format(exp_then_mult_naive(large, zero)))
print('check zero: e**710 * 0 = {}'
    .format(exp_then_mult_check_zero(large, zero)))
print('check zero: e**710 * 0.01 = {}'
    .format(exp_then_mult_check_zero(large, tiny)))
print('scaling: e**710 * 0.01 = {}'.format(exp_then_mult_scaling(large, tiny)))

# output:
# naive: e**710 * 0 = nan
# check zero: e**710 * 0 = 0
# check zero: e**710 * 0.01 = inf
# scaling: e**710 * 0.01 = 2.233994766161711e+306
  • exp_then_mult_naive does what you did: an expression that overflows is multiplied by 0 and you get a nan.
  • exp_then_mult_check_zero catches the overflow and returns 0 if the second argument is 0; otherwise it does the same as the naive version (note that inf * 0 == nan while inf * positive_value == inf). This is the best you could do if there were an LFV constant.
  • exp_then_mult_scaling uses information about the problem to get results for inputs the other two couldn't deal with: if b is small, we can multiply it by e while decrementing a without changing the result. So if np.exp(a) becomes finite before b reaches 1, the result fits. (I know I could check whether it fits in one step instead of using the loop, but this was easier to write right now; a one-step log-space variant is sketched after this list.)
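
For completeness, here is a sketch of that one-step check done in log space. It assumes the helper name exp_then_mult_logspace (mine, not from the code above) and that b is an ordinary finite float:

import numpy as np

def exp_then_mult_logspace(a, b):
    # One-step version of the scaling idea: for b != 0,
    # e**a * b == sign(b) * e**(a + log(abs(b))), so e**a is never formed.
    if b == 0:
        return np.float_(0.0)
    return np.sign(b) * np.exp(a + np.log(np.abs(b)))

print(exp_then_mult_logspace(np.float_(710), np.float_(0.01)))
# roughly 2.234e+306, matching exp_then_mult_scaling above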

So now you have a situation where a solution that doesn't require an LFV is able to provide correct results for more input pairs than one that does. The only advantage an LFV has here is using fewer lines of code while still giving a correct result in that one particular case.

By the way, I'm not sure about thread safety with seterr, so if you're using it in multiple threads with different settings in each thread, test it out beforehand to avoid headaches later.
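
If the manual save/restore bookkeeping above bothers you, the same logic can be written with the np.errstate context manager (also pointed out in the comments), which restores the previous settings automatically. A sketch of the check-zero version rewritten that way:

import numpy as np

def exp_then_mult_check_zero_ctx(a, b):
    # Same behavior as exp_then_mult_check_zero, but np.errstate puts the
    # old error settings back automatically when the block exits.
    with np.errstate(all='ignore', over='raise'):
        try:
            x = np.exp(a)
        except FloatingPointError:
            # Overflowed: 0 if b is 0, otherwise the IEEE default (inf * b).
            return np.float_(0.0) if b == 0 else np.inf * b
    return x * b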


Bonus factoid: the original standard actually stipulated that you should be able to register a trap handler that would, on overflow, be given the result of the operation divided by a large number (see section 7.3). That would allow you to carry on the computation, as long as you keep in mind that the value is actually much larger. Although I guess it could become a minefield of WTF in a multithreaded environment; in any case, I didn't really find any actual support for it.
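
A rough numpy emulation of that idea, just to illustrate; the real trap handler would receive the scaled result directly, and I'm assuming 1536 as the exponent adjustment the 1985 standard prescribes for doubles:

import numpy as np

ALPHA = 1536  # exponent adjustment for doubles per IEEE 754-1985, section 7.3

def exp_with_trap(a):
    # On overflow, deliver the result divided by 2**ALPHA together with the
    # scale, mimicking what the standard's trap handler would have been handed.
    with np.errstate(over='raise'):
        try:
            return np.exp(a), 0
        except FloatingPointError:
            return np.exp(a - ALPHA * np.log(np.float_(2))), ALPHA

x, scale = exp_with_trap(np.float_(710))
print('{} * 2**{}'.format(x, scale))   # a finite value plus an explicit scale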


To get back to the "good enough" point from above: in my understanding, IEEE 754 was designed as a general purpose format, usable for practically any application. When you say "the same issue frequently arises in many different settings", it apparently does not (or at least did not) arise frequently enough to justify inflating the standard.

Let me quote from the Wikipedia article:

[...] the more esoteric features of the IEEE 754 standard discussed here, such as extended formats, NaN, infinities, subnormals etc. [...] are designed to give safe robust defaults for numerically unsophisticated programmers, in addition to supporting sophisticated numerical libraries by experts.

Putting aside that, in my opinion, even having NaN as a special value is a bit of a dubious decision, adding an LFV isn't really going to make it easier or safer for the "numerically unsophisticated", and doesn't allow experts to do anything they couldn't already.

I guess the bottom line is that representing rational numbers is hard. IEEE 754 does a pretty good job of making it simple for a lot of applications. If yours isn't one of them, in the end you'll just have to deal with the hard stuff by either

  • using a higher precision float, if available (ok, this one's pretty easy; a quick illustration follows this list),
  • carefully selecting the order of execution such that you don't get overflows in the first place,
  • adding an offset to all your values if you know they're all going to be very large,
  • using an arbitrary-precision representation that can't overflow (unless you run out of memory), or
  • something else I can't think of right now.
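
As a quick illustration of the first option: on platforms where np.longdouble maps to 80-bit extended precision, e**710 is representable and your original example goes through. Note this is platform-dependent; on some systems np.longdouble is just a plain double and this won't help:

import numpy as np

# Works where np.longdouble is 80-bit extended precision (platform-dependent).
x = np.exp(np.longdouble(710))
print(x)       # about 2.2e+308, finite
print(x * 0)   # 0.0 instead of nan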
dddsnn
  • Good answer. By the way, the [`np.errstate`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.errstate.html) context manager is a slightly nicer way to temporarily change the floating-point error handling for a block of code. – ali_m Apr 25 '15 at 22:34
  • @dddsnn (accepted) Thanks for your thoughts on this issue. Your response helped me understand what is currently available and gave some context for this problem. Your answer is appreciated. Other commenters' thoughts are also appreciated. – Josh Apr 27 '15 at 01:14