Questions tagged [floating-point]

Floating-point numbers approximate real numbers. They can represent a much larger range than integers that occupy the same amount of memory, at the cost of lower precision. If your question is about small arithmetic errors (e.g. why does 0.1 + 0.2 equal 0.30000000000000004?) or decimal conversion errors, please read the tag page before posting.

Many questions asked here are about small inaccuracies in floating-point arithmetic. To use the example from the excerpt, in double precision 0.1 + 0.2 results in 0.30000000000000004 instead of the expected 0.3. Errors like these are caused by the way floating-point numbers are represented in a computer's memory.
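The inaccuracy described above can be observed directly. The following is a minimal sketch, assuming Python's built-in `float` (an IEEE 754 double on common platforms); the variable name `total` is only illustrative. It also shows one common way to compare floats with a tolerance instead of `==`.

```python
import math
from decimal import Decimal

# Neither 0.1 nor 0.2 is exactly representable in binary, so the sum
# carries a small rounding error.
total = 0.1 + 0.2

print(total)         # 0.30000000000000004
print(total == 0.3)  # False

# Decimal(float) shows the exact value that is actually stored for 0.1.
print(Decimal(0.1))

# Comparing with a tolerance is usually what is wanted instead of ==.
print(math.isclose(total, 0.3))  # True
```

The default relative tolerance of `math.isclose` is a reasonable starting point, but the right tolerance depends on the computation.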

Integers are stored as the exact values they represent. A floating-point number is stored as a sign, a significand, and an exponent, each with a fixed number of bits. Because only finitely many significand-exponent pairs exist, most real numbers have no exact match and are rounded to the nearest representable value. In binary formats this includes simple decimal fractions such as 0.1. As a result, some approximation, and therefore some inaccuracy, is unavoidable.
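To make the stored fields concrete, here is a small sketch that assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 fraction bits). The helper name `decompose` is hypothetical, and the reconstruction formula applies to normal numbers only, not to zero, subnormals, infinities, or NaN.

```python
import struct

def decompose(x: float):
    """Split a 64-bit float into sign, unbiased exponent, and fraction bits."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = ((bits >> 52) & 0x7FF) - 1023  # remove the exponent bias
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

sign, exponent, fraction = decompose(0.1)

# For normal numbers: value = (-1)^sign * (1 + fraction / 2^52) * 2^exponent
rebuilt = (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** exponent

print(sign, exponent, hex(fraction))  # 0 -4 0x999999999999a
print(rebuilt == 0.1)                 # True
print(decompose(0.5))                 # (0, -1, 0): 0.5 is exactly 1.0 * 2^-1
```

The repeating 9s in the fraction of 0.1 are the truncated (and rounded) binary expansion of one tenth, whereas 0.5 needs no fraction bits at all.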

Two commonly cited introductory resources on floating-point math are "What Every Computer Scientist Should Know About Floating-Point Arithmetic" and floating-point-gui.de.

FAQs:

Why 0.1 does not exist in floating point

Floating Point Math at https://0.30000000000000004.com/

Related tags:

Programming languages where all numbers are double-precision (64-bit) floats:

15006 questions
9 votes, 2 answers

Why do these two float64s have different values?

Consider these two cases: fmt.Println(912 * 0.01) fmt.Println(float64(912) * 0.01) (Go Playground link) The second one prints 9.120000000000001, which is actually fine, I understand why that is happening. However, why does the first line print…
Attila O. • 15,659 • 11 • 54 • 84
9 votes, 3 answers

How do I find the largest integer less than x?

If x is 2.3, then math.floor(x) returns 2.0, the largest integer smaller than or equal to x (as a float). How would I get i, the largest integer strictly smaller than x (as an integer)? The best I came up with is: i = int(math.ceil(x)-1) Is there a…
pheon • 2,867 • 3 • 26 • 33
9 votes, 1 answer

Go atomic.AddFloat32()

I need a function to atomically add float32 values in Go. This is what I came up with based on some C code I found: package atomic import ( "sync/atomic" "unsafe" "math" ) func AddFloat32(addr *float32, delta float32) (new float32) { …
B_old • 1,141 • 3 • 12 • 26
9 votes, 3 answers

How can I add floats together in different orders, and always get the same total?

Let's say I have three 32-bit floating point values, a, b, and c, such that (a + b) + c != a + (b + c). Is there a summation algorithm, perhaps similar to Kahan summation, that guarantees that these values can be summed in any order and always…
splicer • 5,344 • 4 • 42 • 47
9 votes, 1 answer

Strange behavior of program in GNU C++, using floating-point numbers

Look at this program: #include #include using namespace std; typedef pair coords; double dist(coords a, coords b) { return sqrt((a.first - b.first) * (a.first - b.first) + (a.second - b.second) *…
9 votes, 3 answers

How are upper and lower bounds for floating point numbers determined?

I have a question about the quote below (N3797, 3.9.1/8): The value representation of floating-point types is implementation-defined. As far as I understand it gives the implementation complete freedom in defining boundaries of floating point…
user2953119
9 votes, 4 answers

How quickly check whether double fits in float? (Java)

Are there some arithmetic or bitwise operations that can check whether a double fits into a float without loss of precision? It should not only check that the double range is in the float range, but also that no mantissa bits get lost. Bye P.S.:…
user502187
9 votes, 1 answer

Do gcc's __float128 floating point numbers take the current rounding mode into account?

Do the arithmetic operations on gcc's __float128 floating point numbers take the current rounding mode into account? For instance, if using the C++11 function std::fesetenv, I change the rounding mode to FE_DOWNWARD, will results of arithmetic…
9 votes, 5 answers

Swift extract an Int, Float or Double value from a String (type-conversion)

Please could you help me here? I need to understand how to convert a String into an Int, Float or Double! This problem occurs when I'm trying to get the value from an UITextField and need this type of conversion! I used to do it like this: var…
365Cases • 93 • 1 • 1 • 3
9 votes, 3 answers

How to read in one character at a time from a file in python?

I want to read in a list of numbers from a file as chars one char at a time to check what that char is, whether it is a digit, a period, a + or -, an e or E, or some other char...and then perform whatever operation I want based on that. How can I do…
Harley Jones • 167 • 6 • 20
9 votes, 2 answers

How to make InvariantCulture recognize a comma as a decimal separator?

How do I parse 1,2 with Single.Parse? The reason for asking is that when I use CultureInfo.InvariantCulture I don't get 1.2 as I would like, but rather 12. Shouldn't "Invariant Culture" ignore the culture? Consider the following…
default • 11,485 • 9 • 66 • 102
9 votes, 2 answers

Why is there int but not float in Go?

In Go, there's the type int which may be equivalent to int32 or int64 depending on the system architecture. I can declare an integer variable without worrying about its size with: var x int Why isn't there the type float, which would be equivalent…
cd1 • 15,908 • 12 • 46 • 47
9 votes, 1 answer

Could not find an overload for '*' that accepts the supplied argument

I have converted a String to an Int by using toInt(). I then tried multiplying it by 0.01, but I get an error that says Could not find an overload for '*' that accepts the supplied argument. Here is my code: var str: Int = 0 var pennyCount =…
9 votes, 1 answer

Will different math CPUs yield the same floating point results?

I'm developing OS-portable software that has unit tests that must work on Linux, UNIX, and Windows. Imagine this unit test that asserts that the IEEE single-precision floating point value 1.26743237e+015f is converted to a string: void…
9 votes, 6 answers

Convert float to string without sprintf()

I'm coding for a microcontroller-based application and I need to convert a float to a character string, but I do not need the heavy overhead associated with sprintf(). Is there any eloquent way to do this? I don't need too much. I only need 2 digits…
audiFanatic • 2,296 • 8 • 40 • 56