-2

Every now and then I see rounding errors caused by flooring some values, as shown in the two examples below.

// floor(number, precision)

double balance = floor(0.7/0.1, 3); // = 6.999
double balance = floor(0.7*0.1, 3); // = 0.069
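
Here floor(number, precision) truncates to the given number of decimal digits, roughly along these lines (a sketch; in the real code the precision is converted to a 10^p multiplier and the floor is applied with that, which is all that matters for this question):

// Sketch of the helper assumed in the examples above:
// truncates number to `precision` decimal digits (towards negative infinity).
static double floor(double number, int precision) {
    double multiplier = Math.pow(10, precision); // e.g. 1000.0 for precision 3
    return Math.floor(number * multiplier) / multiplier;
}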

The problem, of course, is that 0.7/0.1 and 0.7*0.1 are not exactly the numbers they should be, due to representation errors [check the NOTE below].

One solution could be to add an epsilon so any representation error is mitigated just before applying the floor.

double balance = floor(0.7/0.1 + 1e-10, 3); // = 7.0
double balance = floor(0.7*0.1 + 1e-10, 3); // = 0.07

What epsilon should I use so it's guaranteed to work in all cases? I feel this solution is rather hacky unless I have a good strategy for choosing the correct epsilon which probably depends on the numbers I'm dealing with.

For instance, if there were a way of getting an estimate of the error (as in representation - number), or at least its sign (whether representation > number or not), that would help me determine in which direction to correct the result before applying the floor.

Any other workaround you can think of is very welcome.

NOTE: I know the real problem is that I'm using doubles and they have representation errors. Please refrain from saying anything like I should store the balance in a long ((long) Math.floor(3931809L/0.080241D) is equally erratic). I also tried using BigDecimal but the performance degraded quite a lot (it's a real-time application). Also note that I'm not very concerned about propagating small errors over time: I do a lot of calculations like those above, but I start from a fresh balance number every time (I do maybe 3 of those operations before returning and starting over).
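
(For the curious: in real arithmetic 3931809 / 0.080241 is exactly 49,000,000, since 0.080241 × 49,000,000 = 3,931,809, yet the double quotient lands just below that, so the floor drops all the way to 48,999,999.)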

EDIT: To make this clear, that's the only operation I do, and I repeat it 3 times on the same balance. For example, I take a balance in USD and convert it to RUB, then to JPY, then to EUR, and I return the balance and start over from the beginning (with a fresh balance number, i.e. no rounding error is propagated beyond these 3 operations). The values are not constrained apart from being positive numbers (i.e. in the range [0, +inf)) and the precision is always at most 8 decimal digits (i.e. 0.00000001 is the smallest balance I will ever have to deal with).
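
In code, one round looks roughly like this (the rates and amounts are made up; the shape of the computation is what matters):

double usdToRub = 62.35, rubToJpy = 1.76, jpyToEur = 0.0078; // made-up rates
double balance = 1234.56789012;                              // fresh USD balance each round
balance = floor(balance * usdToRub, 8); // USD -> RUB
balance = floor(balance * rubToJpy, 8); // RUB -> JPY
balance = floor(balance * jpyToEur, 8); // JPY -> EUR
// return balance; the next round starts from a fresh number again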

Mattx
  • If all you are interested in is converting values back to integers, what's the problem with using [Math.round](https://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#round(double))? It will round up everything above 0.5 and down everything below 0.5, and this gives plenty of margin for small errors. – SergGr Jun 02 '18 at 23:24
  • @SergGr I don't think he is converting to integers only. – 0xCursor Jun 02 '18 at 23:27
  • @LAD, it might be the case that it is not only about converting to integers but for now each example shows casting to `long` and then division by `1000.0` which I assume is a constant in this context that represents required precision. – SergGr Jun 02 '18 at 23:28
  • @SergGr Hm, yeah, that doesn't quite make sense if the variable was initialized as a double. – 0xCursor Jun 02 '18 at 23:29
  • @SergGr I'm not converting values back to integers, why have you concluded so? The "(long)" only affects the numerator; then I have a "/ 1000.0" which converts it back to a double (i.e., the cast is only used for truncating the number). That's a simplification: in reality I'm passing a precision which is converted to a multiplier (10^p) which is used in the floor, but that shouldn't matter. I hope it wasn't you who downvoted the question. Cheers :) – Mattx Jun 02 '18 at 23:31
  • @Mattx, OK, I might misunderstand your question but from your examples I don't see where I might get it wrong. I understand that you do your `/ 1000.0` (but I don't really see why you do it since for now this looks like a constant, so why not store values as `long`, just implying that this `/ 1000.0` is necessary when the value is formatted?) What I ask/suggest is replacing all casts to `long` with `Math.round` calls. What exactly would be wrong in that for you? It seems that might fix your issue unless you have very unusual data or some additional requirements. – SergGr Jun 02 '18 at 23:34
  • Uhm, 2 downvotes for no reason. I'm editing the question so that part is clear, just in case. – Mattx Jun 02 '18 at 23:36
  • @SergGr, as long as you're dealing with doubles I think you always have this problem. `Math.round(0.7/0.2)` equals 3 instead of 4. Maybe storing the balance in a long and using Math.round is the way to go, I'll check it in more examples and see, I'm not sure how the rounding is done internally. – Mattx Jun 02 '18 at 23:46
  • @SergGr I think I'd be able to come up with an example in which `x=number*rate = 100.499999999999` for instance (x being a double, number being a long and rate being a double). In that case `(long) Math.round(x)` will be 100 instead of 101. What do you think? – Mattx Jun 03 '18 at 00:01

2 Answers

0

What epsilon should I use so it's guaranteed to work in all cases?

There is no epsilon that is guaranteed to work in all cases¹. Period.

If you analyze (mathematically²) the computations that your application is performing, then it may be possible to figure out a value of epsilon that will work for you.

But note that there are dangers in repeatedly "rounding off" the errors in a multi-step computation. You can end up with the wrong answer in the end. That's what the maths says.

Finally, ask yourself this: if it is legitimate / safe to just make the epsilon-based adjustments, why (after 50-something years) do typical hand-held calculators still insist that 1.0 / 3 * 3 is 0.9999999999...?


1 - Let's be clear. You have not attempted to specify what your "cases" are. So I am assuming that you mean all possible computations.

2 - The analysis is complicated by the fact that the epsilon between a Real number and the corresponding binary floating-point representation (e.g. a "double") depends on the binary magnitude of the number.
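
As a quick illustration of that, Math.ulp gives the spacing between a double and the next larger double, i.e. the scale of the representation error at that magnitude, and it grows with the number:

System.out.println(Math.ulp(1.0));    // 2.220446049250313E-16
System.out.println(Math.ulp(1000.0)); // 1.1368683772161603E-13
System.out.println(Math.ulp(1.0e8));  // 1.4901161193847656E-8

So a single absolute epsilon cannot track the error across a wide range of magnitudes.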

Stephen C
  • I know there's no silver bullet epsilon for all possible computations with doubles, but my problem is much more specific than the general case. In this context (i.e. given the rounding function `floor`, the `balance`, the `rate` and the `precision`) there may be a way to find an epsilon that works just fine in all cases. I only have to do this operation only a few times on the same balance, I think the error introduced by the `floor` is far bigger than any rounding error introduced by the use of doubles. – Mattx Jun 03 '18 at 02:15
  • Your problem may be more specific, but you haven't told us what it is. Unless you can tell us clearly / precisely / accurately what it is, we can't do the analysis to determine 1) whether your approach will work, and 2) what the epsilon should be. – Stephen C Jun 03 '18 at 02:24
  • In what way is it not specific? Let me know what info I could add and I'll edit the question. – Mattx Jun 03 '18 at 02:26
  • *"I only have to do this operation only a few times on the same balance, I think the error introduced by the floor is far bigger than any rounding error introduced by the use of doubles."* - That is not sufficiently detailed. We need the precise sequence of operations (including what "a few times" means") AND the range of the numbers involved. This is PROPER MATHS we are talking about here not some *"I think the error ..."* stuff. *"I think"* is basically guesswork, – Stephen C Jun 03 '18 at 02:26
  • I have that information in the original question. The operation is always `balance = floor(balance * rate, precision)` and the number of times I apply it is usually 3 (it may be 1 or 2 also, but not more). In other words, you have a balance in, let's say, USD, and you convert it to RUB (one operation), then you get that balance in RUB and you convert it to JPY (second operation), then you get that balance in JPY and you convert it to EUR (third operation). For some currencies I need high precision; I'm using 8 decimal digits. All other values are not constrained (except that they're positive numbers). – Mattx Jun 03 '18 at 02:32
  • Put the information into the Question. (No ... it is not there ... yet.) Also, you have not stated the range. 8 decimal digits is a measure of (relative) precision, not range. – Stephen C Jun 03 '18 at 02:37
  • "I do maybe 3 of those operations before returning". That means I do three of those, and that's it. "8 decimal digits" and "they are positive numbers" sounds like it's 0.00000001 and above, anyway I'm going to edit the question so there's no doubt about it. Don't know why you're being so unfriendly really, all the info is there and what is not I'm pretty happy to add... – Mattx Jun 03 '18 at 02:43
  • *"Don't know why you're being so unfriendly really ..."* - It is because getting you to explain yourself clearly is a bit like pulling teeth without an anaesthetic. The best answer to this question is probably to get you to read and understand this article for yourself: http://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf – Stephen C Jun 03 '18 at 02:49
  • The info was already there, in other words. I haven't added anything new (apart from the precision being at most 8). I'll have a look at the paper, but I don't think linking a 44-page, double-column paper is "the best answer". If you have an idea please be more specific. I've edited the question; if you feel anything is missing let me know. – Mattx Jun 03 '18 at 02:53
  • 1) The document gives the information that would allow you to work it out for yourself. 2) You still have not included any information about the **ranges** of the numbers you are working with. And (as the document explains) this is important. – Stephen C Jun 03 '18 at 03:01
  • "The values are not constrained apart from them being positive numbers", meaning x in [0, +inf). If it makes things easier you can assume it's between 0 and 100,000 (but that really is just guessing, as I have no control over what values comes through the API I'm consuming). – Mattx Jun 03 '18 at 03:08
  • In that case, using epsilons is simply not viable. The accuracy / precision for floating point values (and the epsilons on operations like `+` and `*`) are relative to the magnitude of the values. You cannot calculate an absolute epsilon if you don't know the absolute values. *"I have no control over what values come through the API I'm consuming ..."* - Then you should use `BigDecimal` ... and absorb the overheads. – Stephen C Jun 03 '18 at 03:17
  • It doesn't *need* to be an absolute epsilon; if it depends on the balance, the rate and the precision, that's also fine. `floor(0.7*0.1 + eps(0.7, 0.1, 3), 3)` would work. – Mattx Jun 03 '18 at 03:29
  • Which was also stated from the very beginning "I feel this solution is rather hacky unless I have a good strategy for choosing the correct epsilon which probably depends on the numbers I'm dealing with." :) – Mattx Jun 03 '18 at 03:30
  • Please show me where your Question says that. By my reading, the examples in your Question show the epsilon as an absolute value. – Stephen C Jun 03 '18 at 03:35
  • Ctrl+F "rather hacky", there. The epsilon in my question is absolute, same for the other params (0.7, 0.1 and 3), do you also believe I only have to solve it for those numbers? – Mattx Jun 03 '18 at 03:42
  • Nope. The true (absolute) epsilon values are a function of the quantities that you are getting from the API. – Stephen C Jun 03 '18 at 03:46
  • This is going nowhere. – Mattx Jun 03 '18 at 03:47
  • Exactly. Please read the document ... if you want to go somewhere. – Stephen C Jun 03 '18 at 03:48
  • What you are proposing is an absolute epsilon that is not a function of the input quantities. No such absolute epsilon exists. The absolute epsilon is a function of the input quantities. If you can't put a bound on the quantities, then you cannot predict a range for the epsilon. – Stephen C Jun 03 '18 at 03:51
0

A binary floating point number (double precision) has 53 bits of precision or approximately 15.95 decimal digits.

Let r be a Real and d be the double that is closest to r

epsilon(r) = |r - d| is in the range 0 to 2^(floor(log2(r)) - 53), i.e. at most about r * 2^-53

and, if you have to deal with numbers in the range 0 to N, then the maximum epsilon value across the range will be approximately:

2^(floor(log2(N)) - 53), i.e. roughly N * 2^-53
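
For example, with N = 100,000 (the upper bound suggested in the comments), floor(log2(N)) = 16, so the worst-case representation error is about 2^(16 - 53) = 2^-37 ≈ 7.3e-12; Math.ulp(100000.0) / 2 gives the same figure.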

When you perform a computation you need to estimate the cumulative error for all of the steps in the computation. Addition, multiplication and division are relatively "safe". For example with multiplication:

Let r1 = d1 + e1 and r2 = d2 + e2

r1 * r2 = (d1 + e1) * (d2 + e2) = d1 * d2 + d2 * e1 + d1 * e2 + e1 * e2

Unless the epsilon values are already large, the e1 * e2 term vanishes relative to the others, and the epsilon is going to be ≤ 2 * max(|d1|, |d2|) * max(|e1|, |e2|).

(I think. It is a long, long time since I last did this stuff in anger.)

However, subtraction has nasty properties; see Theorem 9 of the Goldberg paper!
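
For a quick feel for why: subtract two nearly equal values and the representation errors are essentially all that is left of the result.

System.out.println(0.3 - 0.2 - 0.1); // -2.7755575615628914E-17, not 0.0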

The floor function you are using is also a bit tricky.

Let r = d + e

epsilon(floor(r)) = |floor(r, 3) - floor(d, 3)|

which is ≤ max(|ceiling(e, 3)|, 10^-3)
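
For example, with the 0.7/0.1 case from the question: the real quotient r is exactly 7, the computed double d is 6.999999999999999, so floor(r, 3) = 7.000 while floor(d, 3) = 6.999; the flooring step turns a representation error of about 1e-15 into a full 10^-3 error in the result.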

Stephen C