Signed integer overflow, intrinsics, and undefined behaviour

Question

Is the very simple code below susceptible to undefined behaviour as the integer overflows as a result of the operation?

static volatile LONG x = LONG_MAX;

InterlockedIncrement(&x);

According to the standard, signed integer overflow is undefined behaviour. However, here we are out of the standard, as we are calling compiler's intrinsic function, which inlines to some assembly. Also, the value of x is not used anywhere (the function is just used as a memory barrier).

An answer to a similar question suggests this is not UB.

I'd argue it doesn't matter when or where the overflow happens. If it happens then it's UB from the C++ standard point of view. The behavior may be defined from the point of view of the compiler and the code it generates though, but from a strict C++-standard perspective it's still UB. — Some programmer dude, Mar 01 '19 at 07:50
The way I would look at it is: `InterlockedIncrement` still has to return a `LONG`. What value will it return in this case? — P.W, Mar 01 '19 at 07:57
@P.W It could wrap and consequently return `LONG_MIN`, right? Just like .NET's [Interlocked.Increment()](https://learn.microsoft.com/en-us/dotnet/api/system.threading.interlocked.increment?redirectedfrom=MSDN&view=netframework-4.7.2#System_Threading_Interlocked_Increment_System_Int32__) does. — Brandlingo, Mar 01 '19 at 08:01
@MatthäusBrandl The problem is that the "wrapping" is not defined by the C++ specification. — Some programmer dude, Mar 01 '19 at 08:02
It is not terribly undefined behavior these days. They don't build a lot of processors anymore that use one's-complement representation or trap on signed overflow. The logic error it produces is certainly the bigger issue. You'd of course favor MemoryBarrier() if that is all you need. — Hans Passant, Mar 01 '19 at 08:03
Technically undefined behavior in C++, but on x86 `LOCK inc` has well defined behavior. — selbie, Mar 01 '19 at 08:04
@Someprogrammerdude: atomic operations are executed as if the number was unsigned (so atomic overflow is not UB). This is defined by the standard. But `InterlockedIncrement` is not covered by the standard, so MS doc have to be checked. But I'm pretty sure that currently this is not UB, and does what we expect: two's complement wrap on all architectures that MSVC supports. — geza, Mar 01 '19 at 08:05
Unfortunately, MS doc does not say anything about overflows or underflows. Perhaps it does not say anything, because there is nothing to say as the operation is perfectly safe :-) — Krzysiek Karbowiak, Mar 01 '19 at 08:14
@HansPassant: sorry about being super-picky, but it's "ones' complement" (and "two's complement") (and it is not just about number representation. Increment could saturate, for example). — geza, Mar 01 '19 at 08:20

score 2 · Accepted Answer · answered Mar 01 '19 at 11:07

2

I claim there's no UB here, neither per the language standard (the standard doesn't cover this function/intrinsic) nor per the implementation, and there's a simple rollover.

Here's my reasoning...

InterlockedIncrement() is conceptually very simple and if it had a special case, it would be very hard to miss it and fail to document it. And the documentation hasn't mentioned any special case here for some 15+ years.

How would you implement it anyway?

If you're on the 80486 or better, the most natural implementation uses the XADD instruction with the LOCK prefix that atomically adds a value to a memory variable. The instruction by itself does not generate any overflow exception, however it does modify the EFLAGS register as does the regular addition instruction, ADD, so it's possible to detect an overflow and act on it. Specifically, you could throw in the INTO instruction to turn the overflow condition into an exception. Or you could use the conditional jump instruction, JO, to jump the overflow handler.

If you're on the 80386 or better, you can also use the XCHG instruction (the LOCK is implicit with this instruction), to make a loop that would try to atomically update a memory variable (this is how InterlockedExchange() and InterlockedCompareExchange() can be implemented, there's also a handier (for this purpose) CMPXCHG instruction since the 80486). In this case you'd need to perform the register increment as usual, with the ADD instruction or with the INC instruction, and you can optionally detect any overflow condition (in EFLAGS.OF) and handle it as mentioned earlier.

Now, would you want to throw INTO or JO into all instances of InterlockedIncrement()? Probably not, definitely not by default. People like their atomic operations small and fast.

This is the "immediate" UB. What about the "creeping" UB? If you had C code like this:

  int a = INT_MAX;
  if (a + 1 < a)
    puts("Overflow!");

You'd likely get nothing printed nowadays. Modern compilers know that a + 1 can't legally(!) overflow and so the condition in the if statement can be taken as false irrespective of the value of a.

Can you have a similar optimization with InterlockedIncrement()?

Well, given that the variable is volatile and can indeed change in a different thread at any moment, the compiler may not assume unchanged a from two memory reads of it (you'd likely write a + 1 < a or similar as multiple statements and each a would need to be fetched if it's volatile).

It would also be an odd context to try to make the optimization in.

answered Mar 01 '19 at 11:07

Alexey Frunze

61,140
12
83
180

From what I can tell, the Standard would allow an implementation to do anything it likes in response to `InterlockedIncrement(anything)` unless user code defines a function by that name (in which case that function should be called). Implementations would be free to specify that such a call will behave in some useful fashion, but the Standard says nothing about what it would do. – supercat Mar 01 '19 at 17:05
According to http://www.posix.nl/linuxassembly/nasmdochtml/nasmdoca.html, the current opcode for `cmpxchg` was new with Pentium. NASM uses `cmpxchg486` for the old `0F A7` opcode supported by some (but reportedly not all) 486 chips. Anyway, interesting point that you *could* implement a fairly safe atomic `inc` with an `xchg` (not `cmpxchg`) retry loop, if the only write access was other increment threads. You'd detect races (where cmpxchg would have failed) and calculate a new increment to attempt to add. (e.g. if you actually decreased the counter by 2, you then try to add 3 next time). – Peter Cordes Mar 02 '19 at 05:31
But I don't think there's a way to implement `lock xadd` with just `xchg`, only `lock inc` / `lock add 1` for the right total count, without the right "old value". It doesn't preserve the sequence of operations (some threads get a bogus old value). Or another way to look at it, if you want the old value from the first xchg, instead of the successful xchg, is that other threads see effects of you mucking around with the counter multiple times, not one atomic modification. If there are pure readers as well as the writers, they can see non-monotonic changes to the value. – Peter Cordes Mar 02 '19 at 05:34
Anyway, probably also worth pointing out that **C++ `atomic` is defined by ISO C++11 to be 2's complement with overflow defined as wrap-around.** Same for C11 `_Atomic int`. (except that in C++, `++a` and `a += 1` are defined by the standard (and maybe in some real headers) in terms of `fetch_add(1) + 1`, possibly making that part done with plain `int`. https://twitter.com/jfbastien/status/958026267420327936.). InterlockedIncrement is basically an intrinsic for `lock inc/add` or `lock xadd`, (hopefully the former if the result is unused), not subject to C signed integer rules. – Peter Cordes Mar 02 '19 at 05:42
@PeterCordes If you can use `XCHG` to implement a spinlock, you can effectively implement any atomic operation. – Alexey Frunze Mar 02 '19 at 07:13
Yes of course, but your answer sounded like you were describing a lock-free implementation. – Peter Cordes Mar 02 '19 at 07:16
@PeterCordes `InterlockedIncrement()` clearly predates C++11 with its atomics, but the latter suggests that it's a generalization of `InterlockedIncrement()` and such and standardization of common practice. – Alexey Frunze Mar 02 '19 at 07:21
Yes that, and that current implementations are built on top of intrinsics like `InterlockedIncrement`. Also that it would be a terrible design to have signed-overflow UB when the order of operations depends on performance details, and isn't controlled by the programmer. (Which is why C++ standardized that but not 2's complement or signed wraparound in general.) – Peter Cordes Mar 02 '19 at 08:01
Thank you for the detailed discussion. I'm inclined to think there is no UB, as what the `InterlockedIncrement` function does is out of the languages' scope (that is, there is no language instruction that does the increment, just some assembly instruction with predictable behaviour). – Krzysiek Karbowiak Mar 05 '19 at 21:01

Signed integer overflow, intrinsics, and undefined behaviour

1 Answers1