How to alter a float by its smallest increment (or close to it)?

Question

I have a double value f and would like a way to nudge it very slightly larger (or smaller) to get a new value that will be as close as possible to the original but still strictly greater than (or less than) the original.

It doesn't have to be close down to the last bit—it's more important that whatever change I make is guaranteed to produce a different value and not round back to the original.

A double or a float? Depending on which you have, the smallest value will be different. — perimosocordiae, Sep 30 '08 at 22:34
Yeah, I realize my question title and description were inconsistent. I figured the answers could address both cases, which the accepted answer does. — Owen, Sep 30 '08 at 22:48

score 66 · Accepted Answer · edited Aug 27 '13 at 13:45

66

Check your math.h file. If you're lucky you have the nextafter and nextafterf functions defined. They do exactly what you want in a portable and platform independent way and are part of the C99 standard.

Another way to do it (could be a fallback solution) is to decompose your float into the mantissa and exponent part. Incrementing is easy: Just add one to the mantissa. If you get an overflow you have to handle this by incrementing your exponent. Decrementing works the same way.

EDIT: As pointed out in the comments it is sufficient to just increment the float in it's binary representation. The mantissa-overflow will increment the exponent, and that's exactly what we want.

That's in a nutshell the same thing that nextafter does.

This won't be completely portable though. You would have to deal with endianess and the fact that not all machines do have IEEE floats (ok - the last reason is more academic).

Also handling NAN's and infinites can be a bit tricky. You cannot simply increment them as they are by definition not numbers.

edited Aug 27 '13 at 13:45

answered Sep 30 '08 at 22:34

Nils Pipenbrinck

83,631
31
151
221

9

You specifically do NOT want to handle mantissa overflow, since the overflow will roll over onto the exponent which is what you want. – Sep 30 '08 at 22:40
1

Cool - I never thought about that. Incrementing the float as an integer will exactly do what needed. – Nils Pipenbrinck Sep 30 '08 at 22:47
2

It is cool :) Now could the idiot who downvoted my answer saying so please undo it? – Sep 30 '08 at 22:49
2

How would that work if the exponent increment overflowed into the sign bit? – Greg Rogers Sep 30 '08 at 22:51
1

Yep, you would need special cases for inf and nan. – Sep 30 '08 at 22:53
2

I think negative values have to be treated different as well. If you increment those the result will be more negative than the original value. And btw - I just disassembled nextafter in the VS.NET 2008 implementation. They do quite a bit more than I would have expected. – Nils Pipenbrinck Sep 30 '08 at 22:57
Nils: Ah, v. true about -ve numbers. – Sep 30 '08 at 23:08
1

You need extra care around 0 and -0 too – David Heffernan Mar 13 '15 at 20:52
1

In Visual C++ 2008 I found `_nextafter()` in `float.h`. – foraidt May 21 '15 at 16:21
[GNU libc source for `nextafterf`](https://www.sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/ieee754/flt-32/s_nextafterf.c), and [`nextafter` (using only 32bit integers, so it's clunky)](https://www.sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/x86_64/strcpy.S;hb=1233be76694ca81454f61e2ba5a2fb5830840191). These functions handle the +/-0.0 special case, negative floats, and going towards a given 2nd arg, not always +Inf. Remember, that's LGPLed code. Even though it's linked from SO, you can only copy-paste it into GPL-compatible projects. – Peter Cordes Feb 15 '16 at 20:38

score 23 · Answer 2 · 2008-09-30T23:10:07.343

23

u64 &x = *(u64*)(&f);
x++;

Yes, seriously.

Edit: As someone pointed out, this does not deal with -ve numbers, Inf, Nan or overflow properly. A safer version of the above is

u64 &x = *(u64*)(&f);
if( ((x>>52) & 2047) != 2047 )    //if exponent is all 1's then f is a nan or inf.
{
    x += f>0 ? 1 : -1;
}

edited Sep 30 '08 at 23:10

answered Sep 30 '08 at 22:38

1

I wonder if the downvoter could comment as to why this wasn't helpful... Myself, having learned of the nextafter() function, I'd prefer those but if this one would work then I figure it's noteworthy in its own right. – Owen Sep 30 '08 at 22:47
Mike, what's the include file / compiler that will make this run? – David Nehme Sep 30 '08 at 22:55
This is totally implementation dependent and non-portable. Works ok if you have EEMMM, but if you have MMMEE won't give you the result you want. – Benoit Sep 30 '08 at 22:58
@David: Try replacing u64 with 'long long'. @Benoit: yep, it assumes ieee754, I think that's a fairly safe bet nowadays. – Sep 30 '08 at 23:06
2

undefined behaviour, violation of strict aliasing rules – sellibitze Oct 26 '09 at 16:36
@sellibitze: possibly undefined but in practice C compilers that do not handle this kind of type-casting will not be able to build real-world code. So it isn't a problem unless you get over happy with optimization settings. – Zan Lynx Jan 07 '10 at 00:06
can't you just cast it through a void pointer `*(u64*)(void*)(&f)` you are then telling the compiler you know about the aliasing issues and you're OK with it? – Matt Clarkson Dec 12 '11 at 14:41
+1 If doing a x86-64 build, the float would be stored in a SSE register. The fastest way to do this would then be: `_mm_cvtss_f32(_mm_castsi128_ps(_mm_add_epi32(_mm_castps_si128(_mm_set_ss(val)), _mm_set1_epi32(1))))`. Note that cast/cvt/set_ss will not generate any instructions. – rasmus Jan 30 '13 at 00:48
1

This doesn't handle -0 either. – David Heffernan Mar 13 '15 at 20:53
It's not quite *this* simple. See [glibc's implementation of `nextafterf`](https://www.sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/ieee754/flt-32/s_nextafterf.c). Note how you have to decrease the magnitude when `x` is negative and non-zero. (And note that integer compares of FP numbers compare them as 2's complement, so you get the opposite result with two negative nubmers). (The [implementation of `double nextafter`](https://www.sourceware.org/git/gitweb.cgi?p=glibc.git;a=blob;f=math/s_nextafter.c) is the same, but way more clunky because it only uses 32bit integers.) – Peter Cordes Feb 15 '16 at 15:29

score 6 · Answer 3 · answered Sep 30 '08 at 22:38

In absolute terms, the smallest amount you can add to a floating point value to make a new distinct value will depend on the current magnitude of the value; it will be the type's machine epsilon multiplied by the current exponent.

Check out the IEEE spec for floating point represenation. The simplest way would be to reinterpret the value as an integer type, add 1, then check (if you care) that you haven't flipped the sign or generated a NaN by examining the sign and exponent bits.

Alternatively, you could use frexp to obtain the current mantissa and exponent, and hence calculate a value to add.

score 2 · Answer 4 · answered Sep 30 '08 at 22:47

I needed to do the exact same thing and came up with this code:

double DoubleIncrement(double value)
{
  int exponent;
  double mantissa = frexp(value, &exponent);
  if(mantissa == 0)
    return DBL_MIN;

  mantissa += DBL_EPSILON/2.0f;
  value = ldexp(mantissa, exponent);
  return value;
}

score 0 · Answer 5 · answered Sep 30 '08 at 23:01

0

For what it's worth, the value for which standard ++ incrementing ceases to function is 9,007,199,254,740,992.

answered Sep 30 '08 at 23:01

TheSoftwareJedi

34,421
21
109
151

score 0 · Answer 6 · answered Oct 01 '08 at 02:36

This may not be exactly what you want, but you still might find numeric_limits in of use. Particularly the members min(), and epsilon().

I don't believe that something like mydouble + numeric_limits::epsilon() will do what you want, unless mydouble is already close to epsilon. If it is, then you're in luck.

score -2 · Answer 7 · edited Oct 01 '08 at 00:56

I found this code a while back, maybe it will help you determine the smallest you can push it up by then just increment it by that value. Unfortunately i can't remember the reference for this code:

#include <stdio.h>

int main()
{
    /* two numbers to work with */
    double number1, number2;    // result of calculation
    double result;
    int counter;        // loop counter and accuracy check

    number1 = 1.0;
    number2 = 1.0;
    counter = 0;

    while (number1 + number2 != number1) {
        ++counter;
        number2 = number2 / 10;
    }
    printf("%2d digits accuracy in calculations\n", counter);

    number2 = 1.0;
    counter = 0;

    while (1) {
        result = number1 + number2;
        if (result == number1)
            break;
        ++counter;
        number2 = number2 / 10.0;
    }

    printf("%2d digits accuracy in storage\n", counter );

    return (0);
}

How to alter a float by its smallest increment (or close to it)?

7 Answers7

Linked

Related