
I was trying to understand the differences between res1 and res2 in the code below:

#include <iostream>

int main()
{   
    int x = 1;
    int y = 0;
    
    double res1 = double(x)/y;      // OK: evaluates to Inf
    int res2 = x/y;                 // run-time error: Floating point exception
    // 1/0;                         // g++ warning: division by zero [-Wdivision-by-zero]  
    
    std::cout << res1;
    
    return 0;
}

From what I understand, division by zero is undefined behaviour in the C++ standard and the reason for the difference between res1 and res2 is due to my machine implementing IEEE 754 for double, which requires division by zero to return Inf or -Inf.

But now I'm wondering why the standard has to make any pronouncements about division by zero in the first place. This answer says it's to accommodate the various different architectures that implement C++, but I'm not sure - isn't division by zero more of a run-time concern? Especially if the compiler is unlikely to be able to detect it in most cases without evaluating the denominator (I think this is what happens in the example above). Of course, if I try something like 1/0, then g++ gives a warning, but in most cases, we would expect the denominator to be a more complex expression.

user438383
user51462
  • Division is a well-defined arithmetic operation, and you would expect it to behave the same on every architecture, with the exception of division by zero, which is not even mathematically well defined. Division should not depend on the runtime, except for this special case. Do you expect users to check their runtime every time they want to do (valid) division? That would be a nightmare. – freakish Feb 21 '23 at 11:39
  • Users usually don't like it when their programs behave weirdly or outright crash. Having the compiler detect a problem saves you from passing that problem on to the user to find. But as you said, it's not always possible for the compiler to detect it, so when you use any kind of input (from the user, from a database, from *anywhere*) you should add code to make sure such things don't happen. – Some programmer dude Feb 21 '23 at 11:40
  • "isn't division by zero more of a run-time concern?" And that's exactly what the difference between undefined and defined behavior is about: the observable behavior at run time. Undefined behavior most often covers mistakes that the compiler cannot diagnose or is not required to diagnose. You seem to expect it to be always diagnosable, which it isn't. – 463035818_is_not_an_ai Feb 21 '23 at 11:47
  • The standard notes that division by zero is undefined behavior. Additionally the standard makes a special note about **constant** expressions that invoke undefined behavior in `[expr.const]`. Those would normally be evaluated at compile time. – teapot418 Feb 21 '23 at 11:52
  • "why the standard has to make any pronouncements about division by zero in the first place" If the standard didn't say anything about how division by zero behaves, it would still be undefined behavior. That's what UB means: a situation where the standard does not define how the program should behave. The reason that it's explicitly called out as undefined in this case, as opposed to not saying anything about it at all, is probably to make it explicit and clear to the reader that division by zero is not covered by the rest of the definition. – sepp2k Feb 21 '23 at 11:55
  • Not a real answer but: the build machine during constant evaluation and the target machine during run-time could have unpredictably different behaviour. – lesderid Feb 22 '23 at 12:53

3 Answers


As I understand your question, you want to know why the standard explicitly states that division by zero is undefined instead of just omitting it altogether. Here I think there is an important distinction between omitting something from the standard and specifying that something is not defined.

With regard to the evaluation of expressions, the text we are interested in is Clause 5 [expr], paragraph 4:

§5/4
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [ Note: most existing implementations of C++ ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all floating point exceptions vary among machines, and is usually adjustable by a library function. — end note ]

Source: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf
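To make the note's last remark concrete, here is a minimal sketch of the `<cfenv>` library facilities that expose the floating-point exception flags. The helper name `division_by_zero_sets_flag` is made up for this illustration. It assumes an implementation where `double` follows IEEE 754 and where `FE_DIVBYZERO` is defined; some compilers also expect `#pragma STDC FENV_ACCESS ON` before flag tests are strictly reliable, so treat it as a demonstration and not a guarantee.

```cpp
#include <cfenv>
#include <limits>

// Hypothetical helper: divides 1.0 by 0.0 at run time and reports whether the
// implementation raised the FE_DIVBYZERO floating-point exception flag.
bool division_by_zero_sets_flag()
{
    // Assumption: without IEEE 754 (IEC 60559) there is nothing to rely on.
    if (!std::numeric_limits<double>::is_iec559)
        return false;

    // volatile keeps the compiler from folding the division at compile time.
    volatile double numerator = 1.0;
    volatile double denominator = 0.0;

    std::feclearexcept(FE_ALL_EXCEPT);            // start from clean flags
    volatile double quotient = numerator / denominator;
    (void)quotient;                               // only the flag matters here

    return std::fetestexcept(FE_DIVBYZERO) != 0;  // was the flag raised?
}
```

On such a platform the division does not trap; it just sets a status flag that the program can inspect or clear.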

If the standard did not include this statement, readers would have to work out the same conclusion on their own: first notice that the corner case exists, and then deduce that it leads to undefined behavior.

In the C++ standard, undefined behavior is also important for other reasons. Special rules apply to things that have undefined behavior, for example in §5.19 (Constant expressions), which states, among other things:

A conditional-expression e is a core constant expression unless the evaluation of e, following the rules of the abstract machine (1.9), would evaluate one of the following expressions:
...
— an operation that would have undefined behavior [ Note: including, for example, signed integer overflow (Clause 5), certain pointer arithmetic (5.7), division by zero (5.6), or certain shift operations (5.8) — end note ];

So knowing when something is undefined can be important for the reader of the C++ standard.
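As a small sketch of what this rule means in practice (the function name `divide` is invented for the example), a division with a zero divisor cannot be part of a constant expression, so the compiler has to reject it when a constant is required:

```cpp
// Hypothetical helper for illustration.
constexpr int divide(int numerator, int denominator)
{
    return numerator / denominator;
}

// Fine: the evaluation has defined behavior, so this is a constant expression.
constexpr int quotient = divide(6, 3);
static_assert(quotient == 2, "evaluated at compile time");

// Ill-formed if uncommented: evaluating divide(1, 0) would divide by zero, so
// the initializer is not a core constant expression and must be diagnosed.
// constexpr int broken = divide(1, 0);
```

The same call `divide(1, 0)` outside a constant-expression context compiles and simply has undefined behavior at run time.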

Another reason why division by zero needs to be undefined behavior is exactly what you mentioned: it occurs at run time. If it had defined behavior (which it could), the compiler would have to wrap every integer division in code that handles a zero divisor. Since processors can handle division by zero differently, and the C++ authors did not want to impose that overhead, they deliberately state that it is undefined.
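To give a rough idea of the wrapping that defined behavior would require, here is a sketch of the check a programmer can write by hand when they do want it. The name `checked_divide` and its interface are assumptions made for this example, not anything the standard provides.

```cpp
#include <limits>

// Hypothetical helper: stores numerator / denominator in `result` and returns
// true, or returns false (leaving `result` untouched) when the quotient is not
// defined or not representable in int.
bool checked_divide(int numerator, int denominator, int& result)
{
    if (denominator == 0)
        return false;  // division by zero

    // The other undefined case: the quotient of INT_MIN / -1 does not fit.
    if (numerator == std::numeric_limits<int>::min() && denominator == -1)
        return false;

    result = numerator / denominator;
    return true;
}
```

Every call pays for two comparisons and a branch. Leaving the behavior undefined means code that already knows its divisor is non-zero does not pay that cost.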

So in a way you could say that it is a run-time concern. But the compiler produces the code that determines the run-time behavior, division by zero has the potential to break the normal flow of execution, and it matters in other parts of the language (see the constant-expression rules above), so the standard needs to address it. There are also other undefined things that you could call "run-time concerns", integer overflow for one. As a developer I like to know what happens in a given situation and what I can assume about it. With integer overflow, I might expect a specific behavior as logical because my architecture behaves that way. In that regard I want the standard to guide me on what to expect and what not to expect.

Tommy Andersen

But now I'm wondering why the standard has to make any pronouncements about division by zero in the first place.

The standard makes no pronouncements about division by zero. That's what "undefined behavior" means. Whenever the execution of a program for a given set of inputs would have behavior that is either explicitly or implicitly not defined by the standard, the standard makes no demands at all of the C++ implementation (compiler + standard library) with regard to that program under these inputs. And specifying the behavior of C++ implementations for given programs in source form is the only thing the standard does.

Saying explicitly that division by zero has undefined behavior is 100% equivalent to just not saying what the result of division by zero should be. Making it explicit only assures the reader that this is intentional and avoids misunderstandings.

isn't division by zero more of a run-time concern?

Yes, it is a run-time concern, which is why the term "undefined behavior" is used. For analogous situations involving only compile-time questions regardless of program inputs, the standard uses "ill-formed, no diagnostic required" instead.

If the standard doesn't specify how a given program should behave at run-time, then that means that the behavior of the program is undefined. There is also "unspecified" which the standard uses to allow for multiple possible, but locally confined, behaviors and "implementation-defined" for "unspecified" behavior that should additionally be documented by the C++ implementation (and is typically expected to be consistent).
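A short sketch may help to tell these categories apart. All names below are invented for the illustration; the order in which function arguments are evaluated is a common example of unspecified behavior, and the size of `int` is a common example of implementation-defined behavior.

```cpp
#include <climits>
#include <string>

// Hypothetical helpers for illustration only.
std::string& evaluation_log()
{
    static std::string log;
    return log;
}

int record(char label, int value)
{
    evaluation_log() += label;  // remember which operand was evaluated
    return value;
}

int add(int lhs, int rhs) { return lhs + rhs; }

int sum_with_logged_operands()
{
    evaluation_log().clear();
    // Unspecified: the two arguments may be evaluated in either order, so the
    // log ends up as "ab" or "ba". Either way the result is 3.
    return add(record('a', 1), record('b', 2));
}

// Implementation-defined: the size of int must be documented by the
// implementation and stays the same for that implementation.
constexpr int bits_in_int = static_cast<int>(sizeof(int) * CHAR_BIT);
```

In both cases the program still has one of a small set of well-behaved outcomes, which is what separates them from undefined behavior.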

It is not like the standard specifies that the compiler should translate e.g. a / on float to a machine instruction representing division of floating point numbers of appropriate size and then leaves the exact behavior to the particular machine.

The standard specifies how a given program behaves for given inputs in terms of an abstract machine that has nothing to do with any real CPU. It only requires that C++ implementations translate and run programs in such a way that the observable behavior (e.g. I/O) of well-formed programs matches what the described abstract machine would produce. As far as the standard is concerned, there doesn't even need to be a compiler translating to some CPU's instruction set. C++ could be interpreted.

Especially if the compiler is unlikely to be able to detect it in most cases without evaluating the denominator (I think this is what happens in the example above).

Because the standard doesn't define the behavior, the compiler does not need to consider at all whether or not the program contains it. If the compiler wants to do some analysis (to whatever degree) in order to warn about it, it can, but that has nothing to do with the standard. The standard does not require undefined behavior to be diagnosed. It places no requirements on undefined behavior at all. But a user of the compiler will expect it to warn about easily recognizable mistakes.
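One practical consequence is sketched below with two made-up functions. Whether a particular compiler actually performs the transformation described in the comments depends on the compiler and its optimization settings; the point is only that the standard permits it.

```cpp
// Hypothetical example: the zero check comes too late.
int quotient_or_fallback(int numerator, int denominator)
{
    int quotient = numerator / denominator;  // undefined if denominator == 0

    // The division above has already been executed, so a conforming compiler
    // may assume denominator != 0 here and remove this branch entirely.
    if (denominator == 0)
        return -1;

    return quotient;
}

// Checking before dividing keeps the program within defined behavior.
int safe_quotient_or_fallback(int numerator, int denominator)
{
    if (denominator == 0)
        return -1;
    return numerator / denominator;
}
```

Only the second function is guaranteed to return -1 for a zero denominator.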

From what I understand, division by zero is undefined behaviour in the C++ standard and the reason for the difference between res1 and res2 is due to my machine implementing IEEE 754 for double, which requires division by zero to return Inf or -Inf.

Because the C++ standard says that it is undefined, the compiler can do anything without affecting its conformance with the C++ standard. It doesn't matter what the CPU implements: if the compiler wants to, it can ignore that and still optimize under the assumption that floating-point division by zero never happens. And even if the CPU does not support it, the compiler could define division by zero to have some well-defined meaning. But of course, if the compiler wants to be IEEE 754 conforming, then it must ensure that division by zero behaves as IEEE 754 specifies, and it may rely on the CPU's behavior to implement that.
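A program can ask the implementation whether it claims IEEE 754 conformance for `double` through `std::numeric_limits<double>::is_iec559`. The sketch below uses a made-up helper name and only relies on the infinite result when that flag is true; it also assumes the code is not built with options that relax floating-point semantics, such as `-ffast-math`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: returns true only if the implementation claims
// IEEE 754 (IEC 60559) semantics for double and 1.0 / 0.0 yields +infinity.
bool double_division_by_zero_gives_infinity()
{
    if (!std::numeric_limits<double>::is_iec559)
        return false;  // no IEEE 754 guarantee, so do not attempt the division

    // volatile keeps the operands from being folded at compile time.
    volatile double one = 1.0;
    volatile double zero = 0.0;

    double quotient = one / zero;
    return std::isinf(quotient) && quotient > 0.0;
}
```

This mirrors `res1` in the question: the infinite result comes from the implementation's IEEE 754 guarantee, not from the C++ standard's rules for `/`.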

user17732522

"isn't division by zero more of a run-time concern": right, it is often a run-time concern, namely in all cases where it is not possible, or too costly, to find out at compile time.

But this in no way implies that a run-time concern does not have to be specified! The user of the language deserves to be informed of how the program will execute (even if the specification is UB).