If 'float' <= INT_MAX is true, then why may (int)'float' trigger undefined behavior?

Question

Sample code (t0.c):

#include <stdio.h>
#include <limits.h>

#define F 2147483600.0f

int main(void)
{
        printf("F            %f\n", F);
        printf("INT_MAX      %d\n", INT_MAX);
        printf("F <= INT_MAX %d\n", F <= INT_MAX);
        if      ( F <= INT_MAX )
        {
                printf("(int)F       %d\n", (int)F);
        }
        return 0;
}

Invocations:

$ gcc t0.c && ./a.exe
F            2147483648.000000
INT_MAX      2147483647
F <= INT_MAX 1
(int)F       2147483647

$ clang t0.c && ./a.exe
F            2147483648.000000
INT_MAX      2147483647
F <= INT_MAX 1
(int)F       0

Questions:

If F is printed as 2147483648.000000, then why is F <= INT_MAX true?
What is the correct way to avoid UB here?

UPD. Solution:

if      ( lrintf(F) <= INT_MAX )
{
        printf("(int)F       %d\n", (int)F);
}

UPD2. Better solution:

if      ( F <= nextafter(((float)INT_MAX) + 1.0f, -INFINITY) )
{
        printf("(int)F       %d\n", (int)F);
}

Can you add `printf("INT_MAX as float %f\n", (float)INT_MAX);`? If it prints `2147483648`, you have the answer why the `if` is true. — mch, Aug 18 '21 at 12:12
Remember that the limited precision of type `float` (or any noninfinite floating-point type) can apply to the left of the decimal point as well as the right. There is no such `float` number as 2147483600. The nearest representable `float` value is 2147483648. — Steve Summit, Aug 18 '21 at 12:18
With `if ( F <= nextafter(((float)INT_MAX) + 1.0f, -INFINITY) )`, was `nextafterf()` intended, or are you suggesting a `double` function? IAC, your solutions do not belong in the question, but as a post of your own answer where they can be properly rated. — chux - Reinstate Monica, Aug 18 '21 at 18:43
`if ( lrintf(F) <= INT_MAX )` is wrong when `(float) INT_MAX` is exact, e.g. 16-bit `int`, 64-bit `float`. — chux - Reinstate Monica, Aug 18 '21 at 19:17

dbush · Accepted Answer · 2021-08-18T12:28:23.577

You're comparing a value of type int with a value of type float. The operands of the <= operator need to first be converted to a common type to evaluate the comparison.

This falls under the usual arithmetic conversions. In this case, the value of type int is converted to type float. And because the value in question (2147483647) cannot be represented exactly as a float, it results in the closest representable value, in this case 2147483648. This matches what the constant represented by the macro F converts to, so the comparison is true.

Regarding the cast of F to type int, because the integer part of F is outside the range of an int, this triggers undefined behavior.

Section 6.3.1.4 of the C standard dictates how these conversions from integer to floating point and from floating point to integer are performed:

1 When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

2 When a value of integer type is converted to a real floating type, if the value being converted can be represented exactly in the new type, it is unchanged. If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner. If the value being converted is outside the range of values that can be represented, the behavior is undefined. Results of some implicit conversions may be represented in greater range and precision than that required by the new type (see 6.3.1.8 and 6.8.6.4)

And section 6.3.1.8p1 dictates how the usual arithmetic conversions are performed:

First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.

Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.

Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.

As for how to avoid undefined behavior in this case: if the constant F has no suffix, i.e. 2147483600.0, then it has type double. This type can represent any 32-bit integer value exactly, so the given value is not rounded and can be stored in an int.
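A minimal sketch of the double-based approach, assuming a 32-bit int and an IEEE-754 double (so every int value is exact as a double). The helper name fits_in_int is hypothetical. The bounds are strict comparisons against INT_MIN - 1 and INT_MAX + 1 because the conversion truncates toward zero.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true when converting d to int is well defined.
   Assumes double represents every int value exactly, which holds for a
   32-bit int and an IEEE-754 binary64 double. */
bool fits_in_int(double d)
{
    /* Truncation toward zero makes the open interval
       (INT_MIN - 1, INT_MAX + 1) safe. A NaN fails both comparisons. */
    return d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0;
}
```

With this check, 2147483600.0 (a double) passes, while 2147483648.0 is rejected.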

Thanks! Indeed, I forgot that `int` is converted to `float` in `<=`. The solution is to use `lrintf()`: `if (lrintf(F) <= INT_MAX)`. — pmor, Aug 18 '21 at 13:26
dbush, using a `double` may nearly solve the `int/float` issue (`F <= 2147483647.0` is the wrong test; it should be `F < 2147483648.0`, or still as `float` math, `F < 2147483648.0f`) but begs the solution for a `long long/double` one. We can take advantage of the fact that `some_integer_type_MAX` is a [Mersenne number](https://mathworld.wolfram.com/MersenneNumber.html) to form an [exact FP limit](https://stackoverflow.com/a/68837262/2410359) and then not need to rely on wider types. — chux - Reinstate Monica, Aug 18 '21 at 18:27
@pmor `lrintf(F) <= INT_MAX` is not the best as it is not true with values in the [INT_MAX + 0.5 .... INT_MAX + 1.0) range. — chux - Reinstate Monica, Aug 18 '21 at 19:09

score 4 · Answer 2 · answered Aug 18 '21 at 12:19

The root cause of your problem is the implicit conversion of the INT_MAX constant to a float value when doing the F <= INT_MAX comparison. The float data type simply does not have enough precision to store the value 2147483647 exactly, and (as it happens) the value 2147483648 is stored instead^†.

The clang-cl compiler warns about this:

warning : implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]

And, you can confirm this yourself by adding the following line to your code:

printf("(float)IMAX  %f\n", (float)INT_MAX);

That line displays (float)IMAX 2147483648.000000 on my system (Windows 10, 64-bit, clang-cl in Visual Studio 2019).

† The actual value stored in the float in such cases is implementation-defined, as pointed out in the excellent answer by dbush.
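The rounding can also be checked without printing anything. A minimal sketch, assuming a 32-bit int and an IEEE-754 binary32 float whose conversions round to nearest; the function names are hypothetical and not taken from the question.

```c
#include <limits.h>
#include <stdbool.h>

/* True when converting INT_MAX to float changes its value. */
bool int_max_inexact_as_float(void)
{
    /* volatile forces the value to be stored as a real float,
       discarding any extra precision the compiler might keep. */
    volatile float f = (float)INT_MAX;
    return (double)f != (double)INT_MAX;
}

/* The question's test with the conversion written out explicitly:
   f <= INT_MAX converts INT_MAX to float before comparing. */
bool naive_check(float f)
{
    return f <= (float)INT_MAX;
}
```

Under those assumptions, int_max_inexact_as_float() returns true, and naive_check(2147483600.0f) also returns true even though the stored value is 2147483648.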

chux - Reinstate Monica · Answer 3 · 2021-08-18T18:55:14.963

If F is printed as 2147483648.000000, then why is F <= INT_MAX true?

If 'float' <= INT_MAX is true, then why may (int)'float' trigger undefined behavior?

#define F 2147483600.0f ... if ( F <= INT_MAX ) is an insufficient test as it is imprecise. The conversion of INT_MAX to float typically suffers rounding.

What is the correct way to avoid UB here?

To test if a float is convertible to an int without a problem, first review the spec:

When a finite value of standard floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined. C17dr

This means floating-point values in [-2,147,483,648.999... to 2,147,483,647.999...] are acceptable, or in extended math: (INT_MIN - 1 to INT_MAX + 1). Note [] vs. ().

There is no need for wider types.

Code needs to compare the range, as float, precisely.
(INT_MAX/2 + 1) * 2.0f is exact as INT_MAX is a Mersenne Number. ^*1

// Form INT_MAX_PLUS1 as a float
#define F_INT_MAX_PLUS1 ((INT_MAX/2 + 1) * 2.0f)

// With 2's complement, INT_MIN is exact as a float.

if (some_float < F_INT_MAX_PLUS1 && (some_float - INT_MIN) > -1.0f) {
  int i = (int) some_float; // no problem
} else {
  puts("Conversion problem");
}

Tip: form the test like above to also catch some_float as not-a-number.
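The check can be wrapped in a small function. This is a sketch under the assumptions already stated in this answer (2's complement int, so INT_MIN is exact as a float); the names float_fits_int and float_to_int_or are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* INT_MAX + 1 formed exactly as a float: INT_MAX/2 + 1 is a power of two. */
#define F_INT_MAX_PLUS1 ((INT_MAX / 2 + 1) * 2.0f)

/* Hypothetical helper: true when (int)f is well defined.
   Both comparisons are false for NaN, so NaN is rejected. */
bool float_fits_int(float f)
{
    return f < F_INT_MAX_PLUS1 && (f - INT_MIN) > -1.0f;
}

/* Hypothetical helper: returns (int)f when the conversion is safe,
   otherwise the caller-supplied fallback. */
int float_to_int_or(float f, int fallback)
{
    return float_fits_int(f) ? (int)f : fallback;
}
```

For example, with a 32-bit int, float_to_int_or(2147483600.0f, -1) returns -1, because that constant is stored as 2147483648.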

^*1 some_int_MAX may be a problem with UINT128_MAX or more due to limited float exponent range.
