
In reference to the C11 draft, sections 3.4.3 and H.2.2, I'm looking for C implementations that implement behaviour other than modulo arithmetic for signed integers.

Specifically, I am looking for instances where this is the default behaviour, possibly due to the underlying machine architecture.

Here's a code sample and terminal session that illustrates modulo arithmetic behaviour for signed integers:

overflow.c:

#include <stdio.h>
#include <limits.h>

int main(int argc, char *argv[])
{
    int a, b;
    printf ( "INT_MAX = %d\n", INT_MAX );
    if ( argc == 2 && sscanf(argv[1], "%d,%d", &a, &b) == 2 ) {
        int c = a + b;
        printf ( "%d + %d = %d\n", a, b, c );
    }
    return 0;
}

Terminal session:

$ ./overflow 2000000000,2000000000
INT_MAX = 2147483647
2000000000 + 2000000000 = -294967296
rtx13
  • There's no signed overflow in this code, since a and b are converted to ints before the addition. There's a truncation, which is implementation-defined. – Paul Hankin Apr 26 '20 at 16:16
  • Good point, thanks, I'll repost with plain `int`s. – rtx13 Apr 26 '20 at 16:19
  • `gcc`. Compile with `-fsanitize=undefined`; the behavior you will see is standard-conforming. Alternatively, `gcc`. Compile with `-ftrapv`. The behavior you will see is *also* standard-conforming. (A sketch of both with the question's `overflow.c` follows these comments.) – EOF Apr 26 '20 at 16:21
  • So in other words - you're aware that this code is undefined behavior, and you're interested in knowing what various implementations actually do with it - in particular, which implementations do something other than printing -294967296. Is that your question? – Nate Eldredge Apr 26 '20 at 16:24
  • @EOF, thanks, but I'm looking for actual implementations, not standards compliance. – rtx13 Apr 26 '20 at 16:24
  • @NateEldredge Yes, that is correct. – rtx13 Apr 26 '20 at 16:25
  • @rtx13 `gcc` *is* an implementation (or part of one). I pointed out the "standard-conforming" to specify that the `gcc` *implementation* with these compiler flags *is a standard-conforming implementation*. – EOF Apr 26 '20 at 16:25
  • On some compilers (gcc?), `int` may store an out-of-range value on overflow. As in, a value larger than `INT_MAX`. That may easily happen if the `int` needs to be extended later (e.g. if added to a pointer, on a system where `sizeof(int) < sizeof(char*)`); it may be extended earlier in some cases. – numzero Apr 26 '20 at 16:28
  • @EOF yes you are correct. I will rephrase my question as I'm really looking for instances where this is the default behaviour, possibly due to the underlying machine architecture. – rtx13 Apr 26 '20 at 16:35
  • The Standard, §3.4.3, says "An example of undefined behavior is the behavior on integer overflow.". But §6.3.1.3, says "When a value with integer type is converted to another integer type..." which "...is signed and the value cannot be represented in it; either the result is implementation-defined...." As you observe, §H.2.2, says "An implementation that defines signed integer types as also being modulo need not detect integer overflow". Nevertheless, integer overflow isn't defined to be "implementation defined" -- if it was it might be easier to find what you seek :-( – Chris Hall Apr 26 '20 at 17:47
  • @ChrisHall: On most platforms, implementations could at essentially no cost offer *some* very useful behavioral guarantees with regard to integer overflow even though *precisely* specifying the behavior would be expensive. Unfortunately, even though the Standard intended that UB be interpreted as granting license to implementations to process code in whatever fashion would best meet their customers' needs, some compiler writers use it as an excuse to treat their users' needs with contempt. – supercat May 02 '20 at 20:13
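
Following up on the comment above about `-fsanitize=undefined` and `-ftrapv`, here is a sketch of what the question's overflow.c might do when built with those flags. The exact diagnostic wording, source location, and exit status vary by gcc version and platform, so treat this output as illustrative rather than literal:

$ gcc -fsanitize=undefined -o overflow overflow.c
$ ./overflow 2000000000,2000000000
INT_MAX = 2147483647
overflow.c:9:15: runtime error: signed integer overflow: 2000000000 + 2000000000 cannot be represented in type 'int'
2000000000 + 2000000000 = -294967296
$ gcc -ftrapv -o overflow overflow.c
$ ./overflow 2000000000,2000000000
INT_MAX = 2147483647
Aborted (core dumped)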

2 Answers


Even with a "familiar" compiler like gcc, on a "familiar" platform like x86, signed integer overflow can do something other than the "obvious" twos-complement wraparound behavior.

One amusing (or possibly horrifying) example is the following (see on godbolt):

#include <stdio.h>

int main(void) {
    for (int i = 0; i >= 0; i += 1000000000) {
        printf("%d\n", i);
    }
    printf("done\n");
    return 0;
}

Naively, you would expect this to output

0
1000000000
2000000000
done

And with gcc -O0 you would be right. But with gcc -O2 you get

0
1000000000
2000000000
-1294967296
-294967296
705032704
...

continuing indefinitely. The arithmetic is twos-complement wraparound, all right, but something seems to have gone wrong with the comparison in the loop condition.

In fact, if you look at the assembly output, you'll see that gcc has omitted the comparison entirely, and made the loop unconditionally infinite. It is able to deduce that if there were no overflow, the loop could never terminate, and since signed integer overflow is undefined behavior, it is free to have the loop not terminate in that case either. The simplest and "most efficient" legal code is therefore to never terminate at all, since that avoids an "unnecessary" comparison and conditional jump.
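
In C terms, the optimized program behaves roughly as if the loop had been rewritten as follows. This is a sketch of the effective transformation described above, not actual compiler output:

    /* gcc -O2's effective rewrite: the i >= 0 test could only fail after
       signed overflow, which the compiler assumes never happens, so the
       test is dropped and the loop becomes unconditional. */
    for (int i = 0; ; i += 1000000000) {
        printf("%d\n", i);  /* i still wraps around in twos-complement */
    }
    /* printf("done\n") is now unreachable */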

You might consider this either cool or perverse, depending on your point of view.

(For extra credit: look at what icc -O2 does and try to explain it.)

Nate Eldredge
  • Perverse — definitely perverse. – Jonathan Leffler Apr 26 '20 at 18:41
  • I think a nicer example is with `unsigned mul_mod_65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFFu;}`. Your example would be consistent with a compiler that allocates a 128-bit register to hold the value of `i`, and sometimes performs operations using just the bottom 32 bits and sometimes the whole thing. But gcc's treatment of the function above can disrupt the behavior of calling code even in cases where the result of the multiplication never ends up getting used. – supercat May 02 '20 at 20:10

On many platforms, requiring that a compiler perform precise integer-size truncation would cause many constructs to run less efficiently than would be possible if it were allowed to use looser truncation semantics. For example, given `int muldiv(int x, int y) { return x*y/60; }`, a compiler allowed to use loose integer semantics could replace `muldiv(x,240)` with `x<<2`, but one required to use precise semantics would need to actually perform the multiplication and division. Such optimizations are useful, and generally won't pose problems provided that casts are used in cases where programs need mod-reduced arithmetic, and that compilers process a cast to a particular size as implying truncation to that size.
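
A sketch of that transformation (the `muldiv` function is from the paragraph above; the caller name `scale` is just for illustration, and the folded form shows what loose semantics would permit rather than any particular compiler's actual output):

int muldiv(int x, int y) { return x*y/60; }

int scale(int x)
{
    /* With loose ("as if computed in a wider type") semantics, this call may
       be folded: x*240/60 == x*4, i.e. effectively x<<2.  With precise 32-bit
       wraparound semantics the fold would be invalid, because x*240 can wrap
       before the division by 60. */
    return muldiv(x, 240);
}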

Even when using unsigned values, the presence of a cast in `(uint32_t)(uint32a-uint32b) > uint32c` makes the programmer's intention clearer, and would be necessary to ensure that the code operates the same on systems with 64-bit `int` as on those with 32-bit `int`. So if one wants to test for integer wraparound, even on a compiler that defines the behavior, I would regard `(int)(x+someUnsignedChar) < x` as superior to `x+someUnsignedChar < x`, because the cast lets a human reader know the code is deliberately treating values as something other than normal mathematical integers.
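
For instance, a small sketch of the unsigned case (the variable names come from the sentence above; the wrapper function `wrapped_diff_exceeds` is an assumption made only to give a compilable example):

#include <stdint.h>

/* Nonzero when the mod-2^32 difference of uint32a and uint32b exceeds uint32c. */
int wrapped_diff_exceeds(uint32_t uint32a, uint32_t uint32b, uint32_t uint32c)
{
    /* With 32-bit int the subtraction is already done in uint32_t and wraps.
       With 64-bit int the operands promote to signed int, the difference can
       be negative, and only the cast brings it back into uint32_t range, so
       the cast is what keeps the two platforms behaving the same. */
    return (uint32_t)(uint32a - uint32b) > uint32c;
}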

The big problem is that some compilers are prone to generate code which behaves nonsensically in case of integer overflow. Even a construct like `unsigned mul_mod_65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFFu; }`, which the authors of the Standard expected commonplace implementations to process in a way indistinguishable from unsigned math, will sometimes cause gcc to generate nonsensical code in cases where `x` would exceed `INT_MAX/y`.
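
Written out, that example is the following. The overflow arises because, on the common configuration where `short` is 16 bits and `int` is 32 bits, the `unsigned short` arguments promote to signed `int` before the multiplication:

unsigned mul_mod_65536(unsigned short x, unsigned short y)
{
    /* x and y promote to (signed) int, so x*y is a signed multiplication.
       For example x == y == 65535 gives 4294836225, which exceeds INT_MAX,
       so the multiplication is undefined behaviour even though the mask and
       the unsigned return type suggest purely modular arithmetic. */
    return (x*y) & 0xFFFFu;
}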

supercat