14

The following (C99 and newer) code wants to compute a square, restricted to the same number of bits as the original fixed-width type.

    #include <stdint.h>
     uint8_t  sqr8( uint8_t x) { return x*x; }
    uint16_t sqr16(uint16_t x) { return x*x; }
    uint32_t sqr32(uint32_t x) { return x*x; }
    uint64_t sqr64(uint64_t x) { return x*x; }

The problem: depending on the size of int, some of the multiplications are performed on arguments promoted to (signed) int, and the result can overflow a (signed) int. That is undefined behaviour as far as the standard is concerned, and could conceivably give a wrong result, especially on (increasingly rare) machines that do not use two's complement.

If int is 32-bit (resp. 16-bit, 64-bit, 80 or 128-bit), that occurs for sqr16 (resp. sqr8, sqr32, sqr64) when x is 0xFFFF (resp. 0xFF, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF). None of the four functions is formally portable under C99!
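
To check on a given platform whether `sqr16` is affected, the promoted type of `x*x` can be inspected at compile time. This is a minimal sketch that assumes a C11 compiler (for `_Generic`); the helper names `sqr16_uses_signed_int` and `sqr16_can_overflow` are made up for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 if x*x on a uint16_t is computed in (signed) int, else 0. */
int sqr16_uses_signed_int(void) {
    uint16_t x = 1;
    return _Generic(x * x, int: 1, default: 0);
}

/* Returns 1 if that signed multiplication can overflow for x == UINT16_MAX,
   i.e. if UINT16_MAX * UINT16_MAX does not fit in an int. */
int sqr16_can_overflow(void) {
    return sqr16_uses_signed_int() && (INT_MAX / UINT16_MAX < UINT16_MAX);
}
```

On a platform with 32-bit int, both helpers should return 1; with 16-bit int, `uint16_t` promotes to an unsigned type and both return 0.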

Does C11 or later, or some edition of C++, fix that unfortunate situation?


A simple, working solution is:

    #include <stdint.h>
     uint8_t  sqr8( uint8_t x) { return 1u*x*x; }
    uint16_t sqr16(uint16_t x) { return 1u*x*x; }
    uint32_t sqr32(uint32_t x) { return 1u*x*x; }
    uint64_t sqr64(uint64_t x) { return 1u*x*x; }

This is standards-conformant because 1u is not promoted to int and remains unsigned; the usual arithmetic conversions then convert the other operand to an unsigned type, so the left multiplication, then the right one, are performed as unsigned and are well-defined to yield the correct result in the necessary number of low-order bits. The same holds for the final implicit conversion to the result width.
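
As a quick sanity check of the wraparound behaviour, the sketch below restates the four functions with the expected results for a few inputs in the comments. The values follow from (2^n - 1)^2 = 2^(2n) - 2^(n+1) + 1, which is 1 modulo 2^n.

```c
#include <stdint.h>

/* Same definitions as above: 1u forces the arithmetic into an unsigned type. */
uint8_t  sqr8 (uint8_t  x) { return 1u*x*x; }  /* sqr8(0xFF) == 1, sqr8(17) == 33 */
uint16_t sqr16(uint16_t x) { return 1u*x*x; }  /* sqr16(0xFFFF) == 1              */
uint32_t sqr32(uint32_t x) { return 1u*x*x; }  /* sqr32(0xFFFFFFFF) == 1          */
uint64_t sqr64(uint64_t x) { return 1u*x*x; }  /* sqr64(UINT64_MAX) == 1          */
```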

Updated: As suggested in a comment by Marc Glisse, I tried this variant with eight compilers (three versions of GCC for x86 starting with 3.1, MS C/C++ 19.00, Keil ARM compiler 5, two Cosmic compilers for ST7 variants, Microchip MCC18). They all generated the very same code as the original (with the optimizations I use in release mode for actual projects). However, compilers could conceivably generate worse code than the original; and I have several others of my embedded compilers to try, including some 68K and PowerPC ones.

What other options do we have, making a reasonable balance between likely better performance, readability, and simplicity?

fgrieu
  • 3
    Cast wider for the multiplication, then cast back to the narrower type `uint8_t sqr8(uint8_t x) { return (uint8_t)((uint16_t)x * (uint16_t)x); }` – Toby Nov 25 '16 at 10:49
  • 1
    You're right about the general observation: the fixed-size unsigned integral type aliases are not suitable for arithmetic operations with modular behaviour. Use `unsigned long` or something like that. The point is that you need to control the type's conversion rank, and the sized aliases contain *no* conversion rank information. Even `uintmax_t` could be an alias for `unsigned char`. – Kerrek SB Nov 25 '16 at 10:50
  • 3
    @Toby: My understanding is that even with the casts, the arguments will *still* be promoted to `int`. – Bathsheba Nov 25 '16 at 10:52
  • 2
    Yes they will, but as long as you always cast to the *wider* type for the multiplication then the signed-ness should not affect it. (eg squaring the max `uint16_t` value of 65535 results in a value less than the max `uint32_t` value). – Toby Nov 25 '16 at 10:57
  • 1
    @KerrekSB “Even `uintmax_t` could be an alias for `unsigned char`” — But certainly only if its size is equal to `unsigned int`’s, in which case no integer promotion should happen, right? – Konrad Rudolph Nov 25 '16 at 10:58
  • @Toby: You are right! That solution works up to `sqr32`; it is not very efficient, though. I think that truncation to the narrower type is well defined, because that narrower type is explicitly unsigned; in that case, the result must be exact modulo 2 to the number of bits. – fgrieu Nov 25 '16 at 10:58
  • Need to watch for truncation when casting back to the narrower type though... or do an overflow check before the multiplication. And no, efficiency is not its strong suit here. – Toby Nov 25 '16 at 10:59
  • @KonradRudolph: That's the point - promotion still happens. If all your types have the same size, then `unsigned char` gets promoted to `unsigned int`. Sure, it doesn't change the value (promotion never does) or the size of the resulting type, but it's still promotion. – Kerrek SB Nov 25 '16 at 11:01
  • 1
    @KerrekSB Ah, I was under the impression that promotion only happened for smaller types — this is the usual wording. I may be misinterpreting what “smaller” means then: does it refer to the rank of the types rather than its width in bytes? Regardless, at any rate, it seems irrational that promotion should ever happen for `uintmax_t`: by definition, it should *not* be smaller than any other types, in particular `unsigned int`, regardless of how “smaller” is defined. – Konrad Rudolph Nov 25 '16 at 11:05
  • 1
    @KonradRudolph: So, there's one world which contains all the things that seem sensible, in which we call our mums and eat greens and get free healthcare and integer arithmetic works, and there's another world of ISO C and C++... – Kerrek SB Nov 25 '16 at 11:09
  • 1
    @KerrekSB *Or* you could have pointed me to the passage in §4.15/1.5 which states that “The rank of any standard integer type shall be greater than the rank of any extended integer type with the same size” ;-) I’m no longer active following the standards debates but I’m assuming that there’s a good reason for this rule. – Konrad Rudolph Nov 25 '16 at 11:16
  • 3
    What compiler generates what bad code on your second version? – Marc Glisse Nov 25 '16 at 11:17
  • @KonradRudolph: That doesn't seem to have anything to do with your previous problem: `uintmax_t` needn't be an extended type. As I said, it could be `unsigned char`. – Kerrek SB Nov 25 '16 at 11:30
  • 1
    Note that if you use `uint_fast8_t` for `x`, it is very *likely* that it is a typedef of `unsigned int`. That is still not guaranteed though, so it's not a perfect cure. – user694733 Nov 25 '16 at 11:58
  • 2
    Integer promotions aside, note that every single one of those functions will overflow if `x > sqrt(UINTn_MAX)`. This is no fault of the C standard, but of the algorithms. You need to either build in a run-time check or document the maximum value that the functions can handle. For example, in case of `uint8_t`, the function call doesn't make any sense for values larger than 15. I really don't see how it would be useful to get the value 33 for `sqr8(17)`. I would rather prefer to get the value 289. Which obviously doesn't fit in a uint8_t, so the algorithm is the problem. – Lundin Nov 25 '16 at 12:42
  • The practical/cynical solution to the promotion problem is to simply never use types smaller than uint32_t for any form of arithmetic. This solution works excellently, as long as you don't insist on using crappy 8- and 16-bit MCUs in the year 2016, when there's ARM Cortex M0 available at ridiculously cheap prices. – Lundin Nov 25 '16 at 12:47
  • Is the "worse" code that is produced worse because it is simply correct, and the other one has undefined behavior because of signed integer overflow? You really have to define what you expect these functions to do for large unsigned values. – Jens Gustedt Nov 25 '16 at 14:05
  • @Marc Glisse: you were right to ask _"What compiler generates what bad code on your second version?"_. See updated question; that was a pleasant surprise to me. – fgrieu Nov 25 '16 at 18:32
  • 1
    `1u*x*x;` is a good solution IMO, if you want defined wraparound behaviour on "overflow". – M.M Dec 03 '16 at 06:20
  • Note, gcc and clang [provide builtins for checking for overflow during arithmetic operations](http://stackoverflow.com/a/32317442/1708801). – Shafik Yaghmour Dec 06 '16 at 23:24

3 Answers

6

You have identified a fundamental short-coming of the integer type aliases in <stdint.h>: They do not contain any information about the type's conversion rank. Therefore, you have no control over whether values of those types undergo integral promotions, and as you observe correctly, the expression may have undefined behaviour when the integral promotion results in a signed type.

In short: you cannot use the alias types for the purpose of performing the usual arithmetic operations modulo 2^N. You need to use a type whose (known!) conversion rank is at least that of int.

The solution in general would be to convert your operands to the smallest appropriate of unsigned int, unsigned long int or unsigned long long int (provided your platform doesn't have extended integral types), then evaluate the expression, and then convert back to the original type (which has the correct modular behaviour). In C++ you can probably write a type trait that figures out the correct type in a portable way.
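
In C, that conversion can be sketched as follows, relying only on the minimum widths the standard guarantees for `unsigned int` (16 bits), `unsigned long` (32 bits) and `unsigned long long` (64 bits). This is one possible reading of "smallest appropriate", not the only one.

```c
#include <stdint.h>

/* unsigned int has at least 16 value bits, so it holds any uint8_t or uint16_t. */
uint8_t  sqr8 (uint8_t  x) { unsigned int w = x; return (uint8_t)(w * w); }
uint16_t sqr16(uint16_t x) { unsigned int w = x; return (uint16_t)(w * w); }

/* unsigned long has at least 32 value bits. */
uint32_t sqr32(uint32_t x) { unsigned long w = x; return (uint32_t)(w * w); }

/* unsigned long long has at least 64 value bits. */
uint64_t sqr64(uint64_t x) { unsigned long long w = x; return (uint64_t)(w * w); }
```

Because each working type has a rank of at least that of `int`, `w * w` is never promoted to a signed type, and the conversion back to the narrower unsigned type reduces the result modulo 2^N.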

As a cheaper trick, and again assuming the absence of (wider) extended integral types, you could just promote everything to unsigned long long int and hope that your compiler makes the computation in an efficient way.
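
A minimal sketch of that cheaper trick, assuming (as stated) that no fixed-width type is wider than `unsigned long long`; the helper name `sqr_ull` is hypothetical.

```c
#include <stdint.h>

/* Squares in unsigned long long, which is never promoted to a signed type. */
static unsigned long long sqr_ull(unsigned long long x) { return x * x; }

/* Each wrapper converts the wide result back, reducing it modulo 2^N. */
uint8_t  sqr8 (uint8_t  x) { return (uint8_t) sqr_ull(x); }
uint16_t sqr16(uint16_t x) { return (uint16_t)sqr_ull(x); }
uint32_t sqr32(uint32_t x) { return (uint32_t)sqr_ull(x); }
uint64_t sqr64(uint64_t x) { return (uint64_t)sqr_ull(x); }
```

Whether the compiler narrows the 64-bit multiplication back down for the small types is a quality-of-implementation matter, so the generated code is worth checking on the targets that matter.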

Kerrek SB
  • I'd love to see that _"C++ type trait"_ idiom! – fgrieu Nov 25 '16 at 11:16
  • In C++, you don't even necessarily need a type trait: it can be trivially written as `decltype(1u*x)` (where the multiplication is never executed). –  Nov 25 '16 at 11:23
  • 1
    @KerrekSB That *is* an unsigned type. –  Nov 25 '16 at 11:30
  • @hvd: I'd still write it as `common_type_t`, though. – Kerrek SB Nov 25 '16 at 11:33
  • (Small [demo](https://godbolt.org/g/q84SZk) of the `unsigned long long` recommendation.) – Kerrek SB Nov 25 '16 at 11:38
  • 2
    This isn't really a short-coming of stdint.h but of the integer promotion rule. It creates an inconsistency between small and large integer types. If the integer promotion rule didn't exist, and more importantly, if it didn't silently change signedness of the type, C programs would have a whole lot less subtle bugs. At the expense of more blatantly obvious bugs caused by integer overflows. – Lundin Nov 25 '16 at 12:31
4

You can't avoid the inevitable type promotion to int for narrower unsigned types.

It's more of a property of the multiplication operator than anything else.

In order to avoid undefined behaviour corner cases, the only thing you can do is never use multiplication when using unsigned types where the square of their maximum can overflow the int.

Luckily (unless you are working in the embedded world where you can always consult the documentation for the precise behaviour), you can largely consign unsigned short to history: int and its unsigned cousin will most likely be no slower, and possibly faster.

Bathsheba
  • Right; I'm asking _how_ to do this. Including, using a better version of the language definition; or some reasonably efficient and elegant idiom. The `1u*` idiom is reasonably elegant, and is perfectly standards-compliant (according to my understanding), but fails on the _efficient_ test, with many compilers. – fgrieu Nov 25 '16 at 10:48
  • You can't. It's not possible. That's what I'm *trying* to say. – Bathsheba Nov 25 '16 at 10:48
1

How to force unsigned arithmetic on fixed-width types?
What other options do we have, ...?

By using fixed width types that are at least as wide as unsigned for the type of the function's arguments.

This causes the conversion to at least unsigned as part of the parameter passing. The types of the formal parameters and the returned types remain the classic "fixed-width types". The function's actual arguments are also fixed-width types, but maybe wider fixed-width types.

#include <limits.h>   // UINT_MAX
#include <stdint.h>

// uintN: a fixed-width type of at least N bits that is at least as wide as unsigned
#if UINT16_MAX >= UINT_MAX
   typedef uint16_t uint8;
   typedef uint16_t uint16;
   typedef uint32_t uint32;
   typedef uint64_t uint64;
#elif UINT32_MAX >= UINT_MAX
   typedef uint32_t uint8;
   typedef uint32_t uint16;
   typedef uint32_t uint32;
   typedef uint64_t uint64;
#elif UINT64_MAX >= UINT_MAX
   typedef uint64_t uint8;
   typedef uint64_t uint16;
   typedef uint64_t uint32;
   typedef uint64_t uint64;
#endif

uint16_t sqr16(uint16 x) { return x*x; }
uint32_t sqr32(uint32 x) { return x*x; }
uint64_t sqr64(uint64 x) { return x*x; }

// usage
uint16_t x16 = ...;
uint32_t x32 = ...;
uint64_t x64 = ...;
x16 = sqr16(x16);
x32 = sqr32(x32);
x64 = sqr64(x64);

Depending on the function, this is a problem though if the function is called with a wider type like below. uint16_t foo16(uint16 x) might not have taken precautions against receiving a value outside uint16_t range.

x16 = foo16(x32);
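
To illustrate the hazard, here is a small sketch with hypothetical halving functions. It assumes int is at most 32 bits wide, so that the wide parameter type `uint16` would be `uint32_t`; squaring itself is not affected, but an operation such as division is.

```c
#include <stdint.h>

/* Hypothetical wide parameter type, assuming int is at most 32 bits wide. */
typedef uint32_t uint16;

/* Unguarded: bits above bit 15 of the argument leak into the result. */
uint16_t half16_unguarded(uint16 x) { return (uint16_t)(x / 2); }

/* Guarded: reduce to 16 bits first, as a uint16_t parameter would have done. */
uint16_t half16_guarded(uint16 x) { x = (uint16_t)x; return (uint16_t)(x / 2); }
```

Called with 0x10001, the unguarded version returns 0x8000, while the guarded one returns 0, which is what a plain `uint16_t` parameter would have produced.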

Is all this better? I still prefer the explicit 1u as in

uint16_t sqr16(uint16_t x) { return 1u*x*x; }
chux - Reinstate Monica