
What is the largest odd integer that can be represented exactly as a float? And why does it differ from the largest even integer that can be represented as a float?

I believe it has to do with the base-2 exponents, 2^n - 1, but I am not familiar enough with data representation in C to see the distinction.

Peter Cordes
  • [Here's an explanation of the floating point format](https://en.wikipedia.org/wiki/Single-precision_floating-point_format#IEEE_754_single-precision_binary_floating-point_format:_binary32) that's commonly used. The binary representation of an odd integer has a 1 as the LSB. Because the fractional portion of the float is shifted left by the exponent, there comes a point where the LSB of the corresponding integer will always be 0. Any number past that point has to be even. – user3386109 Sep 11 '18 at 01:00
  • @user3386109 I'm a little confused, how is the fractional portion of the float shifted to the left by the exponent? – RobWantsToLearn Sep 11 '18 at 01:14
  • The largest floating point number is, in binary, 1.111111 x 2^e, where e is the largest exponent. This number will look like 1111111000000000000 (although with significantly more 1's and 0's), so it's obviously even. The largest float that is odd, however, will be 1.111111 x 2^x, where x is the maximum number of significant bits in the floating-point format. This works out to the number 1111111, which is obviously odd, and obviously much smaller. – Steve Summit Sep 11 '18 at 01:14
  • @SteveSummit The question is asking for the largest **integer** that can be represented exactly as a float, both even and odd. – RobWantsToLearn Sep 11 '18 at 01:15
  • @RobWantsToLearn Yes, I understand. Both 1111111000000000000 and 1111111 in my example are integers. – Steve Summit Sep 11 '18 at 01:16
  • @RobWantsToLearn Do you understand how floating-point formats work? Do you understand that, in decimal, 1.23 x 10^2 is 123? – Steve Summit Sep 11 '18 at 01:17
  • @RobWantsToLearn Multiplying by 2 is the same as shifting 1 bit to the left. Increasing the exponent by 1 is the same as multiplying by 2, which is the same as shifting 1 bit to the left. – user3386109 Sep 11 '18 at 01:18
  • Going back to what I said earlier, the largest floating point number is 1.111111 x 2^e, which will look like 1111111000000000000, which is obviously an integer, and obviously even. Similarly, 1.111111 x 2^x is 1111111 which is an integer and odd. So those are the largest integers, even and odd, that can be represented exactly as a floating-point number. (I used seven 1's and twelve 0's. For IEEE-754 single-precision floating-point, you'd actually have 24 1's and about 128-23 = 105 0's.) – Steve Summit Sep 11 '18 at 01:20
  • @SteveSummit Okay, I believe I understand what you are saying. So in decimal form 1111111 is 127 (the smallest integer that can be represented as a float) and a very large even integer, 1.11111111 x 2^e, is the largest. How would this change for the largest even and odd integers that can be represented as a **double**, since doubles and floats have the same number of bits? – RobWantsToLearn Sep 11 '18 at 01:36
  • Who says doubles and floats have the same number of bits? For floats, you have 24 bits of precision and an exponent of +-127. For double, you have 53 bits of precision and an exponent of +- 1023. – Steve Summit Sep 11 '18 at 01:37
  • @SteveSummit That is much different. Okay, so how would one go about calculating the largest even and odd integers as doubles then? – RobWantsToLearn Sep 11 '18 at 01:53
  • "largest even integer represented as a float" is certainly `FLT_MAX`. – chux - Reinstate Monica Sep 11 '18 at 01:56
  • @SteveSummit: Re “The largest float that is odd, however, will be 1.111111 x 2^x, where x is the maximum number of significant bits in the floating-point format.” That should be “x is one less than the maximum…” – Eric Postpischil Sep 11 '18 at 02:05

2 Answers


For IEEE-754 basic 32-bit binary floating-point, the largest representable odd integer is 2^24−1.

For IEEE-754 basic 64-bit binary floating-point, the largest representable odd integer is 2^53−1.

This is because the formats have 24-bit and 53-bit significands, respectively. (The significand is the fraction part of a floating-point number.)

The values represented by the bits in the significand are scaled according to the exponent of the floating-point number. In order to represent an odd number, the floating-point number must have a bit in the significand that represents 2^0. With a 24-bit significand, if the lowest bit represents 2^0, then the highest bit represents 2^23. The largest value is obtained when all the bits are on, which makes the value 2^0 + 2^1 + 2^2 + … + 2^23, which equals 2^24−1.

More generally, the largest representable odd integer is normally `scalbnf(1, FLT_MANT_DIG) - 1`. This can also be computed as `(2 - FLT_EPSILON) / FLT_EPSILON`. (This assumes a normal case in which `FLT_RADIX` is even and `FLT_MANT_DIG <= FLT_MAX_EXP`. Note that if `FLT_MANT_DIG == FLT_MAX_EXP`, the latter expression, with `FLT_EPSILON`, should be used, because the former overflows.)

The abnormal cases, just for completeness:

  • If `FLT_RADIX` is odd and `FLT_MANT_DIG <= FLT_MAX_EXP`, the largest representable odd integer is `FLT_MAX` if `FLT_MANT_DIG` is odd and `FLT_MAX - scalbnf(FLT_EPSILON, FLT_MAX_EXP+1)` otherwise.
  • If `FLT_RADIX` is even and `FLT_MANT_DIG > FLT_MAX_EXP`, then: If `FLT_MAX_EXP > 0`, the largest representable odd integer is `floorf(FLT_MAX)`. Otherwise, no odd integers are representable.
  • If `FLT_RADIX` is odd and `FLT_MANT_DIG > FLT_MAX_EXP`, then: If `FLT_MAX_EXP > 0`, the largest representable odd integer is `floorf(FLT_MAX)` if `FLT_MANT_DIG - FLT_MAX_EXP` is odd or `floorf(FLT_MAX)-1` otherwise. Otherwise, no odd integers are representable.
Eric Postpischil
  • How you calculated it makes sense to me. So the largest even integer would be `FLT_MAX`, as chux commented above? – RobWantsToLearn Sep 11 '18 at 02:06
  • And if I instead wanted the largest representable odd integer as a double, would I use the double's significand width in the same way? – RobWantsToLearn Sep 11 '18 at 02:07
  • So largest odd integer represented as a **double** would be 2^(52) - 1? and largest even would be 2^1023? – RobWantsToLearn Sep 11 '18 at 02:20
  • @RobWantsToLearn: For IEEE-754 basic 64-bit binary format, the largest representable odd integer is 2^53−1, not 2^52−1. – Eric Postpischil Sep 11 '18 at 02:27
  • @RobWantsToLearn: To adjust the above for `double`, change `FLT` to `DBL` and remove the `f` suffix from the function names. – Eric Postpischil Sep 11 '18 at 02:27
  • @RobWantsToLearn: If `FLT_RADIX` is even, the largest representable even integer is normally `FLT_MAX`. This is the floating-point value with all significand bits and the maximum exponent. 2^1023 has only one significand bit set, not all of them. For IEEE-754 64-bit, it would be 2^1024−2^971. (In the abnormal case, the exponent cannot be large enough to scale the significand to an integer. Such a format is not usually encountered in practice. Its largest even integer would be `floorf(FLT_MAX)`, which also works for the normal case.) – Eric Postpischil Sep 11 '18 at 02:29

The largest odd integer representable as a 32-bit float is 2^24 - 1.

More specifically, a 32-bit float can exactly represent the following integers:

  • All integers up to 2^24
  • All even integers up to 2^25
  • All multiples of 4 up to 2^26
  • All multiples of 8 up to 2^27
  • etc.

In other words, here are the integers exactly representable:

0
1
2
...
16777215
16777216 (= 2^24)
16777218
16777220
...
33554430
33554432 (= 2^25)
33554436
33554440
...
67108860
67108864 (= 2^26)
67108872
67108880
...

Note that this applies to both positive and negative integers. For example, all negative integers down to -2^24 can be exactly represented.

Below is some C++ code that converts integers around 2^24, 2^25, and 2^26 to float, to see this in practice.


#include <iostream>
#include <iomanip>
#include <vector>

int main()
{
    // Powers of two at which the spacing between adjacent floats doubles.
    int _2pow24 = 1 << 24;
    int _2pow25 = 1 << 25;
    int _2pow26 = 1 << 26;
    std::cout << "2^24 = " << _2pow24 << std::endl;
    std::cout << "2^25 = " << _2pow25 << std::endl;
    std::cout << "2^26 = " << _2pow26 << std::endl;

    // Collect a window of integers around each power of two.
    std::vector<int> v;
    for (int i = -4; i < 4; ++i) v.push_back(_2pow24 + i);
    for (int i = -8; i < 8; ++i) v.push_back(_2pow25 + i);
    for (int i = -16; i < 16; ++i) v.push_back(_2pow26 + i);

    // Print each integer next to the float it converts to.
    for (int i : v) {
        std::cout << i << " -> "
                  << std::fixed << std::setprecision(1)
                  << static_cast<float>(i)
                  << std::endl;
    }
    return 0;
}

Output:

2^24 = 16777216
2^25 = 33554432
2^26 = 67108864
16777212 -> 16777212.0
16777213 -> 16777213.0
16777214 -> 16777214.0
16777215 -> 16777215.0
16777216 -> 16777216.0
16777217 -> 16777216.0
16777218 -> 16777218.0
16777219 -> 16777220.0
33554424 -> 33554424.0
33554425 -> 33554424.0
33554426 -> 33554426.0
33554427 -> 33554428.0
33554428 -> 33554428.0
33554429 -> 33554428.0
33554430 -> 33554430.0
33554431 -> 33554432.0
33554432 -> 33554432.0
33554433 -> 33554432.0
33554434 -> 33554432.0
33554435 -> 33554436.0
33554436 -> 33554436.0
33554437 -> 33554436.0
33554438 -> 33554440.0
33554439 -> 33554440.0
67108848 -> 67108848.0
67108849 -> 67108848.0
67108850 -> 67108848.0
67108851 -> 67108852.0
67108852 -> 67108852.0
67108853 -> 67108852.0
67108854 -> 67108856.0
67108855 -> 67108856.0
67108856 -> 67108856.0
67108857 -> 67108856.0
67108858 -> 67108856.0
67108859 -> 67108860.0
67108860 -> 67108860.0
67108861 -> 67108860.0
67108862 -> 67108864.0
67108863 -> 67108864.0
67108864 -> 67108864.0
67108865 -> 67108864.0
67108866 -> 67108864.0
67108867 -> 67108864.0
67108868 -> 67108864.0
67108869 -> 67108872.0
67108870 -> 67108872.0
67108871 -> 67108872.0
67108872 -> 67108872.0
67108873 -> 67108872.0
67108874 -> 67108872.0
67108875 -> 67108872.0
67108876 -> 67108880.0
67108877 -> 67108880.0
67108878 -> 67108880.0
67108879 -> 67108880.0
Boris Dalstein