Questions tagged [ieee-754]

IEEE 754 is the most common & widely used floating-point standard, notably the single-precision binary32 aka float and double-precision binary64 aka double formats.

IEEE 754 is the Institute of Electrical and Electronics Engineers (IEEE) standard for floating-point arithmetic, and the most widely used floating-point standard.

Wikipedia on IEEE 754 (2008)
ieee.org documentation
https://en.wikipedia.org/wiki/Single-precision_floating-point_format aka binary32, usually called float or real4. Has nice diagrams of the bit pattern, the range over which it can represent every integer exactly, and so on.
https://en.wikipedia.org/wiki/Double-precision_floating-point_format usually called double or real8
Algorithm to convert an IEEE 754 double to a string? including the recent Ryū: fast float-to-string conversion
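
As a rough illustration of the binary32 layout and integer range mentioned in the links above, here is a minimal Python sketch. The helper names `float32_fields` and `to_float32` are made up for this example; only the standard `struct` module is used.

```python
import struct

def float32_fields(x):
    """Round x to binary32 and split it into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    # 1 sign bit, 8 exponent bits (bias 127), 23 fraction bits
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-2.5))   # (1, 128, 2097152)

# Every integer up to 2**24 is exactly representable in binary32;
# 2**24 + 1 is the first one that is not.
print(to_float32(float(2**24)) == 2**24)          # True
print(to_float32(float(2**24 + 1)) == 2**24 + 1)  # False
```

The same approach with the `">d"` and `">Q"` format codes exposes the 1/11/52-bit fields of binary64.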

As well as formats, IEEE 754 also defines the basic operations (+ - * / and sqrt) as producing correctly rounded results (error <= 0.5 ulp). Other functions like pow and sin are not required to be as accurate; that's an implementation choice between precision and performance.

This is why many CPU instruction sets only include the basic operations (including sqrt).
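
The 0.5 ulp bound for the basic operations can be spot-checked with exact rational arithmetic. This is only a sketch, assuming Python 3.9+ (for `math.ulp`) on a platform whose `float` is IEEE 754 binary64; `half_ulp_ok` and the sample inputs are made up for this example, and sqrt is left out for brevity.

```python
from fractions import Fraction
import math
import operator

def half_ulp_ok(computed, exact):
    """True if the float result is within half an ulp of the exact rational result."""
    return abs(Fraction(computed) - exact) <= Fraction(math.ulp(computed)) / 2

ops = [operator.add, operator.sub, operator.mul, operator.truediv]
pairs = [(0.1, 0.2), (1.0, 3.0), (1e16, 1.0), (2.5, 0.3)]

# Fraction(x) converts a float exactly, so op(Fraction(a), Fraction(b))
# is the mathematically exact result for the stored operands.
results = [
    half_ulp_ok(op(a, b), op(Fraction(a), Fraction(b)))
    for op in ops
    for a, b in pairs
]
print(all(results))
```

Note that this compares against the exact result for the operands as stored (for example the binary64 value nearest to 0.1), not against the decimal literals in the source.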

1447 questions

4 answers

Explain a surprising parity in the rounding direction of apparent ties in the interval [0, 1]

Consider the collection of floating-point numbers of the form 0.xx5 between 0.0 and 1.0: [0.005, 0.015, 0.025, 0.035, ..., 0.985, 0.995] I can make a list of all 100 such numbers easily in Python: >>> values = [n/1000 for n in range(5, 1000,…

asked Jul 03 '20 at 18:50

Mark Dickinson

1 answer

Probability that a formula fails in IEEE 754

On my computer, I can check that (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3) evaluates to False. More generally, I can estimate that the formula (a + b) + c == a + (b + c) fails roughly 17% of the time when a,b,c are chosen uniformly and independently…

python numpy floating-point probability ieee-754

asked Aug 16 '19 at 02:58

hilberts_drinking_problem

5 answers

Is there a Dart function to convert List to Double?

I'm getting a Bluetooth Characteristic from a Bluetooth controller with flutter blue as a List. This characteristic contains weight measurement of a Bluetooth scale. Is there a function to convert this list of ints to a Double? I tried to find some…

dart bluetooth ieee-754

asked Jul 31 '19 at 06:16

nvano

3 answers

Floating point number in JavaScript (IEEE 754)

If I understand correctly, JavaScript numbers are always stored as double precision floating point numbers, following the international IEEE 754 standard. Which means it uses 52 bits for the fraction (significand). But in the picture above, it seems like…

javascript ieee-754

asked Mar 21 '19 at 12:50

hungneox

1 answer

What exactly does (1.0e300 + pow(2.0, -30.0) > 1.0) do in STDC?

I have come across a function which computes atan(x) (the source is here). Reducing it to the core of my question and slightly reformatting it, they have something like this: static const double one = 1.0, huge =…

c ieee-754 atan2

asked Feb 19 '19 at 18:51

Binarus

4 answers

Convert from the IBM floating point to the IEEE floating point standard and vice versa in C#

I was looking for a way to convert IEEE floating point numbers to IBM floating point format for an old system we are using. Is there a general formula we can use in C# to this end?

c# c floating-point ieee-754

asked Dec 28 '10 at 23:28

aiw

1 answer

Java/C: OpenJDK native tanh() implementation wrong?

I was digging through some of the Java Math functions native C source code. Especially tanh(), as I was curious to see how they implemented that one. However, what I found surprised me: double tanh(double x) { ... if (ix < 0x40360000) { …

java c math floating-point ieee-754

asked Jan 24 '17 at 00:50

Martijn Courteaux

3 answers

Is there any way to see a number in its 64 bit float IEEE754 representation

Javascript stores all numbers as double-precision 64-bit format IEEE 754 values according to the spec: The Number type has exactly 18437736874454810627 (that is, 2^64 − 2^53 + 3) values, representing the double-precision 64-bit format IEEE 754 values…

javascript binary ieee-754

asked May 07 '16 at 11:53

Max Koretskyi

2 answers

For any finite floating point value, is it guaranteed that x - x == 0?

Floating point values are inexact, which is why we should rarely use strict numerical equality in comparisons. For example, in Java this prints false (as seen on ideone.com): System.out.println(.1 + .2 == .3); // false Usually the correct way to…

language-agnostic floating-point precision ieee-754 signedness

asked Aug 30 '10 at 10:35

polygenelubricants

1 answer

Convert IEEE 754 to decimal floating point

I have what I think is an IEEE 754 value with single or double precision (not sure) and I'd like to convert it to decimal in PHP. Given 4 hex values (which might be in little-endian format, so basically reversed order) 4A,5B,1B,05 I need to convert it to…

php hex decimal ieee-754

asked Dec 28 '15 at 17:34

matt

2 answers

Minimum and maximum of signed zero

I am concerned about the following cases: min(-0.0,0.0), max(-0.0,0.0), minmag(-x,x), maxmag(-x,x). According to Wikipedia, IEEE 754-2008 says in regards to min and max: The min and max operations are defined but leave some leeway for the case where the…

c++ c floating-point sse ieee-754

asked Jun 18 '15 at 11:21

Z boson

3 answers

Why are numbers with many significant digits handled differently in C# and JavaScript?

If JavaScript's Number and C#'s double are specified the same (IEEE 754), why are numbers with many significant digits handled differently? var x = (long)1234123412341234123.0; // 1234123412341234176 - C# var x = 1234123412341234123.0; //…

javascript c# floating-point ieee-754

asked May 27 '15 at 06:57

Hans Malherbe

4 answers

What is the risk of numerical instabilities when predividing denominators?

Suppose I want to divide many numbers by the same value. a /= x; b /= x; c /= x; ... Since multiplication is faster, the temptation is to do this: tmp = 1.0f / x; a *= tmp; b *= tmp; c *= tmp; ... 1) Is this guaranteed to produce identical answers? I…

c floating-point numeric ieee-754

asked Feb 02 '15 at 14:04

spraff

1 answer

Why do we need IEEE 754 remainder?

I just read this topic (especially the last comments). Then I was wondering why we actually need this way of giving the remainder. But it seems that not many people "on google" were interested in that before...

floating-point modulo ieee-754

asked Oct 31 '14 at 10:04

TheTrowser

5 answers

0.0 and -0.0 in Java (IEEE 754)

Java is totally compatible with IEEE 754, right? But I'm confused about how Java decides the sign in floating-point addition and subtraction. Here is my test result: double a = -1.5; double b = 0.0; double c = -0.0; System.out.println(b * a); …

java floating-point ieee-754

asked Jun 16 '14 at 07:11

MagicFingr
