All operations on "standard" signed integer types in C (short, int, long, etc.) exhibit undefined behaviour if they yield a result outside the [TYPE_MIN, TYPE_MAX] interval, where TYPE_MIN and TYPE_MAX are respectively the minimum and maximum integer values that the specific integer type can store.

According to the C99 standard, however, all intN_t types are required to have a two's complement representation:

7.18.1.1 Exact-width integer types
1. The typedef name intN_t designates a signed integer type with width N , no padding bits, and a two’s complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.

Does this mean that intN_t types in C99 exhibit well-defined behaviour in case of an integer overflow? For example, is this code well-defined?

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    printf("Minimum 32-bit representable number: %" PRId32 "\n", INT32_MAX + 1);
    return 0;
}
Mysticial
Alexandros

2 Answers

No, it doesn't.

The requirement for a 2's-complement representation for values within the range of the type does not imply anything about the behavior on overflow.

The types in <stdint.h> are simply typedefs (aliases) for existing types. Adding a typedef doesn't change a type's behavior.

Section 6.5 paragraph 5 of the C standard (both C99 and C11) still applies:

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
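One way to stay clear of this rule is to test whether the result would be representable before performing the operation. The following is only a minimal sketch of that idea for addition; the helper name `checked_add_i32` is hypothetical and not something the standard provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b in *out and returns true only if the
   mathematical sum fits in int32_t. On failure *out is left untouched. */
bool checked_add_i32(int32_t a, int32_t b, int32_t *out)
{
    /* The range checks themselves cannot overflow: INT32_MAX - b is only
       evaluated when b > 0, and INT32_MIN - b only when b < 0. */
    if ((b > 0 && a > INT32_MAX - b) ||
        (b < 0 && a < INT32_MIN - b)) {
        return false;
    }
    *out = a + b;   /* known to be in range at this point */
    return true;
}
```

With such a check, the caller decides what to do on overflow (report an error, saturate, and so on) instead of evaluating an expression whose behavior is undefined.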

This doesn't affect unsigned types because unsigned operations do not overflow; they're defined to yield the wrapped result, reduced modulo TYPE_MAX + 1. The exception is that unsigned types narrower than int are promoted to (signed) int, and can therefore run into the same problems. For example, this:

unsigned short x = USHRT_MAX;
unsigned short y = USHRT_MAX;
unsigned short z = x * y;

causes undefined behavior if short is narrower than int and the product exceeds INT_MAX. (If short and int are 16 and 32 bits, respectively, then 65535 * 65535 yields 4294836225, which exceeds INT_MAX.)
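A common way to sidestep the promotion to signed int is to force the arithmetic into unsigned int before multiplying. The sketch below assumes that wraparound is the result you want; the function name `mul_ushort_wrap` is made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper: multiplies two unsigned short values and returns
   the product reduced modulo USHRT_MAX + 1. */
unsigned short mul_ushort_wrap(unsigned short x, unsigned short y)
{
    /* Multiplying by 1u first makes the usual arithmetic conversions pick
       unsigned int, so the product wraps modulo UINT_MAX + 1 instead of
       overflowing a signed int. */
    unsigned int product = 1u * x * y;

    /* Converting to an unsigned type is defined to reduce the value
       modulo USHRT_MAX + 1. */
    return (unsigned short)product;
}
```

The same idiom should carry over to fixed-width types such as uint16_t or uint32_t on platforms where they are narrower than int.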

Keith Thompson
  • Is this mentioned implicitly or explicitly somewhere in the standard? – Alexandros Feb 20 '12 at 19:51
  • 6.5p5, which I quoted, mentions it explicitly. Nothing else in the standard says or implies that 6.5p5 doesn't apply to the `intN_t` types. And nothing in the standard defines the common wraparound behavior for 2's-complement types; for the behavior to be defined, the standard would have to define it somewhere. – Keith Thompson Feb 20 '12 at 19:53
  • "This doesn't affect unsigned types because ..." Shouldn't this be qualified, since unsigned types narrower than `int` are promoted to `int` (and not `unsigned`) first and then suffer the same behavior limitations? – chux - Reinstate Monica Jun 14 '15 at 15:50

Although storing an out-of-range value to a signed type in memory will generally store the bottom bits of the value, and reloading the value from memory will sign-extend it, many compilers' optimizations assume that signed arithmetic won't overflow, so the effects of overflow may be unpredictable in many real scenarios. As a simple example, consider a 16-bit DSP which uses its one 32-bit accumulator for return values (e.g. TMS3205X). Given int16_t foo(int16_t bar) { return bar+1; }, a compiler would be free to load bar, sign-extended, into the accumulator, add one to it, and return. If the calling code were e.g. long z = foo(32767), the code might very well set z to 32768 rather than -32768.
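If wraparound is what the code actually needs, it can be spelled out with unsigned arithmetic rather than left to the compiler. This is just a sketch of one possible approach; `inc_wrap_i16` is a hypothetical name, and the final step avoids the implementation-defined conversion of an out-of-range value to a signed type.

```c
#include <stdint.h>

/* Hypothetical helper: returns bar + 1 with explicit two's-complement
   wraparound, so that 32767 maps to -32768. */
int16_t inc_wrap_i16(int16_t bar)
{
    /* Conversion to uint16_t is defined (modulo 2^16). Widening to
       uint32_t keeps the addition in an unsigned type, where it cannot
       overflow, and the outer cast reduces the sum modulo 2^16 again. */
    uint16_t u = (uint16_t)((uint32_t)(uint16_t)bar + 1u);

    /* Map the 16-bit pattern back to a signed value by hand. */
    if (u <= (uint16_t)INT16_MAX) {
        return (int16_t)u;
    }
    return (int16_t)((int32_t)u - 65536);
}
```

Written this way, `long z = inc_wrap_i16(32767);` yields -32768 regardless of how wide the machine's accumulator is, because every intermediate value stays within the range of its type.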

supercat
  • @BenVoigt: Could the example have been corrected as `long foo(int bar) {return bar+1;}`? I've used a compiler for a DSP where `int` was 16 bits, but the processor's only 32-bit accumulator was used to return function results. If such a function were called by `long z = foo(32767)+1;` the compiler would likely assign `z` a value of +32769 rather than -32767 [the calling code would add one to the accumulator, and then store both halves]. I don't remember the exact cases that yielded 'interesting' behaviors, but there were some. – supercat Aug 26 '13 at 15:16
  • @BenVoigt: Also, I find the language you cite curious. Would every possible "value" of an `int16` expression (including implementation-defined ones) be required to be one that could be stored without modification in an `int16` variable? I find it interesting that out-of-range arithmetic is totally "undefined behavior", but out-of-range conversions allow "implementation-defined signals". – supercat Aug 26 '13 at 15:22
  • Yes, that example seems like a reasonable case where the undefined behavior could do something entirely unexpected in light of the type system. For the first example, yes I believe that if there is no signal, the result must be a valid `int16_t` value. The most reasonable actual behaviors are bitmasking (modulo arithmetic just like unsigned) and saturation (which some hardware does implement). – Ben Voigt Aug 26 '13 at 15:28
  • @BenVoigt: Incidentally, for reasons somewhat alluded to by my example, I'm somewhat dubious of efforts to replace all occurrences of types like `unsigned int` with types like `uint32_t`, given that `int` and `unsigned int` have a semantic meaning as the minimum sizes to which signed or unsigned integral expressions will be promoted. If code compares an `int` and an `unsigned int`, the comparison is specified to be unsigned; by comparison, a comparison between `int16_t` and `uint16_t`, or between `int32_t` and `uint32_t`, could be signed or unsigned depending upon size of int. – supercat Aug 26 '13 at 17:46
  • @BenVoigt: Another question alluded to an even nastier example, btw: Given `uint32_t x=0xFFFFFFFF`, would `x=(x*x);` set `x` to 1 or would it cause Undefined Behavior? Would `x*=x;` be any different? – supercat Aug 26 '13 at 18:33
  • What are the options? 1. It got promoted to its own type. Then the result is `((2**32-1) * (2**32-1)) % (2**32)`, which is `1`. 2. It got promoted to a larger signed type. Then the result is `((2**32-1) * (2**32-1))` iff that result is representable, and otherwise UB. No, `x*=x` isn't different, except that if `int` can represent `((2**32-1) * (2**32-1))`, it will get stored in `x` as `1` (because coercion to `unsigned` types is performed using modulo arithmetic) – Ben Voigt Aug 26 '13 at 18:41
  • @BenVoigt: That's why I offered it up as a nasty example. One wouldn't expect code which works on machines with a 32-bit `int` type to cause overflow on machines with 64-bit `int`, but in the above example it can. – supercat Aug 26 '13 at 18:48