Questions tagged [ieee-754]

IEEE 754 is the most widely used floating-point standard. It notably defines the single-precision binary32 (aka float) and double-precision binary64 (aka double) formats.

IEEE 754 is the Institute of Electrical and Electronics Engineers (IEEE) standard for floating-point arithmetic, and the most widely used standard of its kind.

Wikipedia on IEEE 754 (2008)
ieee.org documentation
https://en.wikipedia.org/wiki/Single-precision_floating-point_format aka binary32, usually called float or real4. Has clear diagrams of the bit pattern and covers the range over which every integer is exactly representable.
https://en.wikipedia.org/wiki/Double-precision_floating-point_format usually called double or real8
Algorithm to convert an IEEE 754 double to a string? including the recent Ryū: fast float-to-string conversion
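
To make the binary32 bit pattern and its exact-integer range concrete, here is a minimal Python sketch. The helper names float32_fields and to_float32 are made up for this illustration; the sketch assumes the standard struct module's "f" format, which packs a value as IEEE 754 binary32.

```python
import struct

def float32_fields(x):
    """Round x to binary32 and split it into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, bias 127
    fraction = bits & 0x7FFFFF        # 23 explicit fraction bits
    return sign, exponent, fraction

def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-2.5))   # (1, 128, 2097152)

# Every integer up to 2**24 is exactly representable in binary32;
# 2**24 + 1 is the first one that is not.
print(to_float32(2.0**24) == 2**24)          # True
print(to_float32(2.0**24 + 1) == 2**24 + 1)  # False
```

The 24-bit significand (23 stored bits plus the implicit leading 1) is what sets the 2**24 limit; the same reasoning gives 2**53 for binary64.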

As well as formats, IEEE 754 also defines the basic operations (+, -, *, / and sqrt) as producing correctly rounded results (error <= 0.5 ulp). Other functions like pow and sin are not required to be as accurate; that's an implementation choice between precision and performance.

This is why many CPU instruction sets only include the basic operations (including sqrt).
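
The 0.5 ulp bound on the basic operations can be checked with exact rational arithmetic. This is only a sketch: sum_error_in_ulps is a hypothetical helper name, and it assumes Python floats are binary64 with the default round-to-nearest mode (true on mainstream platforms) and Python 3.9+ for math.ulp.

```python
from fractions import Fraction
import math

def sum_error_in_ulps(a, b):
    """Return |(a + b) - exact sum|, measured in ulps of the computed result."""
    computed = a + b                     # rounded binary64 addition
    exact = Fraction(a) + Fraction(b)    # exact rational sum of the two doubles
    ulp = Fraction(math.ulp(computed))   # spacing of doubles at the result
    return abs(Fraction(computed) - exact) / ulp

print(0.1 + 0.2)                                      # 0.30000000000000004
print(sum_error_in_ulps(0.1, 0.2) <= Fraction(1, 2))  # True
```

The well-known 0.30000000000000004 is therefore not an addition bug: the inputs 0.1 and 0.2 are already rounded, and the sum of those two doubles is rounded correctly.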

1447 questions

2 answers

Can't get 0.30000000000000004 by calculating

When I run in the console 0.1 + 0.2 the result is 0.30000000000000004. So I tried to calculate it myself. Here are the steps I've taken. 1) Represent 0.1 as IEEE754 double: 0.1 = 0 01111111011 1001100110011001100110011001100110011001100110011010 2)…

javascript binary ieee-754

asked May 19 '16 at 15:10

Max Koretskyi

2 answers

Why the IEEE-754 exponent bias used in this C code is 126.94269504 instead of 127?

The following C function is from fastapprox project. static inline float fasterlog2 (float x) { union { float f; uint32_t i; } vx = { x }; float y = vx.i; y *= 1.1920928955078125e-7f; return y - 126.94269504f; } Could some experts here…

c math floating-point ieee-754

asked Jun 26 '15 at 20:30

Astaroth

7 answers

Do-s and Don't-s for floating point arithmetic?

What are some good do-s and don't-s for floating point arithmetic (IEEE754 in case there's confusion) to ensure good numerical stability and high accuracy in your results? I know a few like don't subtract quantities of similar magnitude, but I'm…

c floating-point ieee-754

asked Jun 23 '10 at 14:35

gct

4 answers

CLR JIT optimizations violates causality?

I was writing an instructive example for a colleague to show him why testing floats for equality is often a bad idea. The example I went with was adding .1 ten times, and comparing against 1.0 (the one I was shown in my introductory numerical…

.net floating-point clr equality ieee-754

asked Feb 08 '10 at 22:52

Gobiner

2 answers

flush-to-zero behavior in floating-point arithmetic

While, as far as I remember, IEEE 754 says nothing about a flush-to-zero mode to handle denormalized numbers faster, some architectures offer this mode (e.g. http://docs.sun.com/source/806-3568/ncg_lib.html ). In the particular case of this…

c embedded floating-point ieee-754

asked Jan 18 '10 at 02:14

Pascal Cuoq

2 answers

Is SSE floating-point arithmetic reproducible?

The x87 FPU is notable for using an internal 80-bit precision mode, which often leads to unexpected and unreproducible results across compilers and machines. In my search for reproducible floating-point math on .NET, I discovered that both major…

.net floating-point sse ieee-754 x87

asked Feb 28 '13 at 22:49

Asik

4 answers

Incorrect floating point behavior

When I run the C++ program below on a 32-bit PowerPC kernel that supports software floating-point emulation (hardware floating point disabled), I get an incorrect conditional evaluation. Can someone tell me what the potential problem is here? #include…

c++ floating-point embedded-linux ieee-754 powerpc

asked Feb 23 '13 at 20:34

rajachan

2 answers

Math.pow with negative numbers and non-integer powers

The ECMAScript specification for Math.pow has the following peculiar rule: If x < 0 and x is finite and y is finite and y is not an integer, the result is NaN. (http://es5.github.com/#x15.8.2.13) As a result Math.pow(-8, 1 / 3) gives NaN rather…

javascript floating-point ieee-754 ecmascript-5 ecma262

asked Jan 29 '13 at 04:32

Nathan Wall

1 answer

Why is pow(-infinity, positive non-integer) +infinity?

C99 annex F (IEEE floating point support) says this: pow(−∞, y) returns +∞ for y > 0 and not an odd integer. But, say, (−∞)^0.5 actually has the imaginary values ±∞i, not +∞. C99’s own sqrt(−∞) returns a NaN and generates a domain error as…

c math floating-point standards ieee-754

asked Apr 28 '12 at 19:51

Chortos-2

5 answers

How many distinct floating-point numbers in a specific range?

How many representable floats are there between 0.0 and 0.5? And how many representable floats are there between 0.5 and 1.0? I'm more interested in the math behind it, and I need the answer for floats and doubles.

language-agnostic ieee-754

asked Jan 16 '12 at 01:56

Arlen

1 answer

subnormal IEEE 754 floating point numbers support on iOS ARM devices (iPhone 4)

While porting an application from Linux x86 to iOS ARM (iPhone 4), I've discovered a difference in floating-point arithmetic behavior for small values. 64-bit floating-point numbers (double) smaller than [+/-]2.2250738585072014E-308 are called…

c ios floating-point arm ieee-754

asked Sep 08 '11 at 10:24

Yann Droneaud

4 answers

Are JS engines allowed to change the bits of a NaN?

In JavaScript, the NaN value can be represented by a wide range of 64-bit doubles internally. Specifically, any double with the following bitwise representation: x111 1111 1111 xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx Is…

javascript ieee-754

asked Nov 07 '19 at 20:55

MaiaVictor

6 answers

IEEE-754 compliant round-half-to-even

The C standard library provides the round, lround, and llround family of functions in C99. However, these functions are not IEEE-754 compliant, because they do not implement the "banker's rounding" of half-to-even as mandated by IEEE. Half-to-even…

c floating-point rounding ieee-754 bankers-rounding

asked Sep 23 '15 at 18:05

68ejxfcj5669

3 answers

What is the result of comparing a number with NaN?

Consider for example bool fun (double a, double b) { return a < b; } What will fun return if any of the arguments are NaN? Is this undefined / implementation defined behavior? What happens with the other relational operators and the equality…

c++ floating-point nan ieee-754

asked Jul 04 '15 at 21:32

Baum mit Augen

4 answers

What is (+0)+(-0) by IEEE floating point standard?

Am I right that any arithmetic operation on any floating numbers is unambiguously defined by IEEE floating point standard? If yes, just for curiosity, what is (+0)+(-0)? And is there a way to check such things in practice, in C++ or other commonly…

c++ floating-point ieee-754 negative-zero

asked Mar 09 '15 at 19:00

se0808
