Questions tagged [ieee-754]

IEEE 754 is the most widely used floating-point standard. It defines, notably, the single-precision binary32 (aka float) and double-precision binary64 (aka double) formats.

IEEE 754 is the Institute of Electrical and Electronics Engineers standard for floating-point arithmetic, and the most widely used standard of its kind.

Wikipedia on IEEE 754 (2008)
ieee.org documentation
https://en.wikipedia.org/wiki/Single-precision_floating-point_format: binary32, usually called float or real4. Has nice diagrams of the bit pattern and gives the range over which every integer can be represented exactly.
https://en.wikipedia.org/wiki/Double-precision_floating-point_format: binary64, usually called double or real8.
Algorithm to convert an IEEE 754 double to a string? (answers include the recent Ryū: fast float-to-string conversion)
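As a rough illustration of the binary32 bit pattern mentioned above, the following sketch splits a value into its sign, biased exponent, and fraction fields using Python's standard struct module. The helper names float32_fields and to_float32 are made up for this example and are not part of any library.

```python
import struct

def float32_fields(x):
    """Round x to binary32 and return (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 explicitly stored significand bits
    return sign, exponent, fraction

def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(float32_fields(1.0))
print(float32_fields(-0.0))
# binary32 has a 24-bit significand, so integers up to 2**24 are all exact,
# but 2**24 + 1 is not representable and gets rounded.
print(to_float32(2.0**24), to_float32(2.0**24 + 1))
```

Note that -0.0 and +0.0 differ only in the sign field, and that the first integer binary32 cannot hold exactly is 2**24 + 1.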

Besides the formats, IEEE 754 also requires the basic operations (+, -, *, / and sqrt) to produce correctly rounded results (error <= 0.5 ulp in the default round-to-nearest mode). Other functions such as pow and sin are not required to be as accurate; their accuracy is an implementation trade-off between precision and performance.

This is why many CPU instruction sets only include the basic operations (including sqrt).
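To make "correctly rounded" concrete, here is a small sketch that compares a floating-point result against the exact rational result rounded once to a double. It assumes Python floats are binary64 and relies on CPython converting a Fraction to float with correct rounding; the helper name is_correctly_rounded is made up for this example.

```python
from fractions import Fraction

def is_correctly_rounded(result, exact):
    """True if `result` equals the exact rational `exact` rounded to a double."""
    return result == float(exact)

a, b = 0.1, 0.2
# Fraction(a) is the exact value stored in the double a, not the decimal 0.1.
exact_sum = Fraction(a) + Fraction(b)
exact_product = Fraction(a) * Fraction(b)
exact_quotient = Fraction(a) / Fraction(b)

print(is_correctly_rounded(a + b, exact_sum))
print(is_correctly_rounded(a * b, exact_product))
print(is_correctly_rounded(a / b, exact_quotient))
# The sum is correctly rounded, yet still differs from the literal 0.3,
# because the inputs 0.1 and 0.2 were already rounded when they were parsed.
print(a + b == 0.3)
```

The check only covers the operations themselves: each one returns the representable value nearest to the exact result for the inputs it was given, and says nothing about errors already present in those inputs.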

1447 questions

votes

4 answers

Behaviour of negative zero (-0.0) in comparison with positive zero (+0.0)

In my code, float f = -0.0; // Negative, compared with negative zero via f == -0.0f, gives true. But float f = 0.0; // Positive, compared with negative zero via f == -0.0f, also gives true instead of false. Why in both cases…

c++ floating-point ieee-754

asked Aug 21 '17 at 10:58

Jayesh

4,755
9
32
62

votes

3 answers

Detecting a negative 0 stored as a double in C++

I am doing some mathematical calculations (trying to convert Matlab code into C++, using VS2010) and I need to be able to tell if at some point I get a negative 0. According to the IEEE standard -0/+0 differ only in the sign bit (the rest are 0).…

c++ double ieee-754

asked Apr 01 '11 at 17:23

vicky5G

votes

1 answer

IEEE double such that sqrt(x*x) ≠ x

Does there exist an IEEE double x>0 such that sqrt(x*x) ≠ x, under the condition that the computation x*x does not overflow or underflow to Inf, 0, or a denormal number? This is given that sqrt returns the nearest representable result, and so does…

floating-point double ieee-754 square-root

asked Jan 15 '17 at 00:11

Fengyang Wang

11,901
2
38
67

votes

6 answers

Can -ffast-math be safely used on a typical project?

While answering a question where I suggested -ffast-math, a comment pointed out that it is dangerous. My personal feeling is that outside scientific calculations, it is OK. I also assume that serious financial applications use fixed point instead of…

c++ c optimization floating-point ieee-754

asked Aug 16 '16 at 15:32

bolov

72,283
15
145
224

votes

9 answers

Why does this loop never end?

Possible Duplicate: problem in comparing double values in C# I've read it elsewhere, but I really forget the answer, so I ask here again. This loop never seems to end, regardless of the language you code it in (I tested it in C#, C++, Java...): double d =…

floating-point ieee-754

asked Jul 28 '10 at 08:34

Truong Ha

10,468
11
40
45

votes

1 answer

Denormalized Numbers - IEEE 754 Floating Point

So I'm trying to learn more about denormalized numbers as defined in the IEEE 754 standard for floating-point numbers. I've already read several articles thanks to Google search results, and I've gone through several Stack Overflow posts. However I…

performance floating-point standards ieee-754 subnormal-numbers

asked Feb 28 '13 at 16:38

dtmland

2,136
4
22
45

votes

6 answers

How to check if float can be exactly represented as an integer

I'm looking for a reasonably efficient way of determining if a floating point value (double) can be exactly represented by an integer data type (long, 64 bit). My initial thought was to check the exponent to see if it was 0 (or more precisely…

c double ieee-754

asked Jan 18 '12 at 04:36

ircmaxell

163,128
34
264
314

votes

2 answers

Invertability of IEEE 754 floating-point division

What is the invertibility of IEEE 754 floating-point division? I mean, is it guaranteed by the standard that if double y = 1.0 / x then x == 1.0 / y, i.e. x can be restored precisely, bit by bit? The cases when y is infinity or NaN are obvious…

c++ floating-point precision floating-accuracy ieee-754

asked Aug 01 '16 at 22:44

plasmacel

8,183
7
53
101

votes

2 answers

Cases where float == and != are not direct opposites

In https://github.com/numpy/numpy/issues/6428, the root cause for the bug seems to be that at simd.inc.src:543, a compiler optimizes !(tmp == 0.) to tmp != 0.. A comment says that these are "not quite the same thing." But doesn't specify any…

c ieee-754 comparison-operators

asked Jul 08 '16 at 15:39

ivan_pozdeev

33,874
19
107
152

votes

3 answers

What uses do floating point NaN payloads have?

I know that IEEE 754 defines NaNs to have the following bitwise representation: the sign bit can be either 0 or 1; the exponent field contains all 1 bits; some bits of the mantissa are used to specify whether it's a quiet NaN or signalling NaN. The…

floating-point nan ieee-754

asked Nov 28 '15 at 04:47

Jordan Melo

1,193
7
26

votes

4 answers

Is there an open-source c/c++ implementation of IEEE-754 operations?

I am looking for a reference implementation of IEEE-754 operations. Is there such a thing?

c++ open-source floating-point ieee-754

asked Feb 02 '10 at 19:01

zr.

7,528
11
50
84

votes

5 answers

Why do I need 17 significant digits (and not 16) to represent a double?

Can someone give me an example of a floating point number (double precision), that needs more than 16 significant decimal digits to represent it? I have found in this thread that sometimes you need up to 17 digits, but I am not able to find an…

c floating-point precision ieee-754 floating-accuracy

asked May 25 '11 at 00:11

Ondřej Čertík

votes

12 answers

Why are c/c++ floating point types so oddly named?

C++ offers three floating point types: float, double, and long double. I infrequently use floating-point in my code, but when I do, I'm always caught out by warnings on innocuous lines like float PiForSquares = 4.0; The problem is that the literal…

c++ c floating-point history ieee-754

asked Dec 29 '08 at 23:47

Roddy

66,617
42
165
277

votes

8 answers

Convert ieee 754 float to hex with c - printf

Ideally the following code would take a float in IEEE 754 representation and convert it into hexadecimal void convert() //gets the float input from user and turns it into hexadecimal { float f; printf("Enter float: "); scanf("%f",…

c floating-point ieee-754

asked May 31 '10 at 02:27

Michael

votes

2 answers

Best non-trigonometric floating point approximation of tanh(x) in 10 instructions or less

Description I need a reasonably accurate fast hyperbolic tangent for a machine that has no built-in floating point trigonometry, so e.g. the usual tanh(x) = (exp(2x) - 1) / (exp(2x) + 1) formula is going to need an approximation of exp(2x). All…

algorithm math floating-point ieee-754 approximation

asked Sep 19 '22 at 08:58

hidefromkgb

5,834
1
13
44
