How to weigh up calculation error

Question

Consider the following example. There is an image in which the user can select a rectangular area (part of the image). The image is displayed at some scale. Then we change the scale and need to recalculate the coordinates of the selection. Take the width:

newSelectionWidth = round(oldSelectionWidth / oldScale * newScale)

where oldScale = oldDisplayImageWidth / realImageWidth, newScale = newDisplayImageWidth / realImageWidth, all the values except for scales are integers.
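For concreteness, the calculation above can be sketched in JavaScript as follows. The function name `rescaleSelectionWidth` and its parameter order are illustrative assumptions, not something given in the question.

```javascript
// Hypothetical helper (name and signature are illustrative only).
// Rescales a selection width from the old display scale to the new one.
function rescaleSelectionWidth(oldSelectionWidth, oldDisplayImageWidth,
                               newDisplayImageWidth, realImageWidth) {
  const oldScale = oldDisplayImageWidth / realImageWidth;
  const newScale = newDisplayImageWidth / realImageWidth;
  return Math.round(oldSelectionWidth / oldScale * newScale);
}

// A selection spanning the whole displayed image (640 of 640 pixels),
// rescaled to a display width of 1000 for a 1920-pixel-wide image.
console.log(rescaleSelectionWidth(640, 640, 1000, 1920));
```

The question is whether a full-width selection like this one always maps to exactly the new display width.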

The question is: how can one prove that newSelectionWidth = newDisplayImageWidth given oldSelectionWidth = oldDisplayImageWidth, for any values of oldDisplayImageWidth, newDisplayImageWidth, and realImageWidth? Or under what conditions does this not hold?

I was thinking about the answer too, and this is what I've come up with; it may be inaccurate and/or incomplete.

All numbers in JavaScript are double-precision floating-point numbers. Generally, this gives a maximum error of about 10^-16 (machine epsilon). This means that, in order to accumulate an error of 0.5 or more, (1) we would need to perform 0.5 / 10^-16 = 5·10^15 operations. The other source of error is calculation with numbers that are too big (|value| > 1.7976931348623157·10^308) or too small (|value| < 2.2250738585072014·10^-308). This means that (2) if somewhere in the course of the calculation we get a number that is too big or too small, e.g. because oldDisplayImageWidth / realImageWidth > 1.7976931348623157·10^308 or the like, then the error might exceed 0.5. Given that we're talking about displaying images on today's monitors, all these conditions are extremely unlikely.
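The constants mentioned above can be inspected directly in JavaScript. This is only a small sketch; the variable names are illustrative. Note that `Number.MIN_VALUE` is the smallest positive subnormal double, not the 2.2250738585072014·10^-308 figure quoted above, which is the smallest positive normal double.

```javascript
const eps = Number.EPSILON;                      // 2^-52, about 2.22e-16
const unitRoundoff = eps / 2;                    // 2^-53, about 1.11e-16
const largest = Number.MAX_VALUE;                // 1.7976931348623157e308
const smallestNormal = 2.2250738585072014e-308;  // the lower figure quoted above
const smallestSubnormal = Number.MIN_VALUE;      // 5e-324

// Adding eps to 1 is visible; adding half of it is rounded away.
console.log(1 + eps > 1, 1 + unitRoundoff === 1);
```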

If you are worried about error causing you to exceed extents, then you will need to clip them. — Orbling, May 31 '13 at 17:18
I'd like to know if it can exceed image extents. I don't like the idea of adding superfluous code. — x-yuri, May 31 '13 at 17:23

score 1 · Answer 1 · answered Jun 06 '13 at 13:44

You're mixing up absolute and relative error. Assuming a relative error of 10^-16 you end up with a maximum relative error of 4 * 10^-16 after the 4 operations in your example. You want an absolute error < 0.5, so you're fine, as long as newSelectionWidth * 4 * 10^-16 < 0.5.
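As a quick arithmetic check of this rough bound, the inequality can be solved for the width using exact integer arithmetic (BigInt). The names `roughLimit` and `provenLimit` are illustrative; the second value is the limit 1125899906842624 quoted elsewhere in this thread, included only for comparison.

```javascript
// Rough bound: width * 4 * 10^-16 < 0.5  <=>  width < 10^16 / 8.
const roughLimit = 10n ** 16n / 8n;
// Limit from the stricter worst-case analysis in this thread: 2^50.
const provenLimit = 2n ** 50n;

console.log(roughLimit, provenLimit, roughLimit > provenLimit);
```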

Henrik

The maximum error in a single operation is not 10^-16; it exceeds 1.11•10^-16. And the relative error in four operations is not limited to four times the error of one operation. In this case, it is only slightly greater than four but, in other operation sequences, it may be infinite. Your constraint on `newSelectionWidth` limits it to 1250000000000000, but a simple analysis of the errors requires a stricter limit, 1125899906842624 (as shown in my answer). It is likely a more involved analysis could relax the constraint, by showing that certain combinations of errors are impossible. – Eric Postpischil Jun 06 '13 at 14:04
@EricPostpischil A rough estimate is more than enough in this case, so I ignored 10^-16 vs. 1.11•10^-16 and (1+e)^4-1 vs. 4*e. Of course, calculating the relative error like this is only possible in case of multiplication and division as in this example. – Henrik Jun 06 '13 at 14:12
How do you know a rough estimate is enough if you do not have a proof? You can do a rough estimate **and** have a correct proof by using approximations on the safe sides of the exact values instead of on the wrong sides. E.g., 1.2•10^-16 instead of 10^-16 (or 2•10^-16 or 10^-15 if you want simpler numbers) and use 4.1 or 5 for the error multiplier. – Eric Postpischil Jun 06 '13 at 18:12

Eric Postpischil · Accepted Answer · 2013-06-06T14:13:19.783

If newDisplayImageWidth is less than 1125899906842624 (2^50) and the other integers are positive and do not exceed 53 bits, then newSelectionWidth equals newDisplayImageWidth. A proof follows.

Notation:

I will use the term `double` to name the floating-point type being used, IEEE-754 64-bit binary.
Text in `code style` represents computed values, while plain text represents mathematical values. Thus 1/3 is exactly one-third, while `1./3.` is the result of dividing 1 by 3 in floating-point arithmetic.

I assume:

The widths are positive integers not wider than the double significand (53 bits).
The divisions `oldDisplayImageWidth / realImageWidth` and `newDisplayImageWidth / realImageWidth` are performed in double arithmetic with the operands converted to double.

The limits on the integers ensure that conversion to double is exact and that neither overflow nor underflow occurs during the operations used in this problem.
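The exactness of the conversion can be illustrated with a short JavaScript sketch. The helper `isPositive53BitInteger` is a hypothetical name introduced here; it relies on the standard `Number.isSafeInteger`, which accepts integers of magnitude at most 2^53 - 1.

```javascript
// Hypothetical helper: true for positive integers that fit in 53 bits.
function isPositive53BitInteger(n) {
  return Number.isSafeInteger(n) && n > 0;
}

const twoTo53 = 2 ** 53;                            // 9007199254740992
console.log(isPositive53BitInteger(twoTo53 - 1));   // true
console.log(twoTo53 + 1 === twoTo53);               // true: 2^53 + 1 is not representable
console.log(isPositive53BitInteger(1920.5));        // false: not an integer
```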

Consider `oldScale`, which is a double set to `oldDisplayImageWidth / realImageWidth`. The maximum error in a single floating-point operation in round-to-nearest mode is half an ULP (because every mathematical number is no farther than half an ULP from a representable number). Thus, `oldScale` equals oldDisplayImageWidth / realImageWidth • (1+e₀), where e₀ represents the relative error and is at most half a double epsilon in magnitude. (The double epsilon is 2^-52, so |e₀| ≤ 2^-53.)

Similarly, `newScale` is newDisplayImageWidth / realImageWidth • (1+e₁), where e₁ is some error with |e₁| ≤ 2^-53.

Then `oldSelectionWidth / oldScale` is oldSelectionWidth / `oldScale` • (1+e₂), again for some |e₂| ≤ 2^-53, and `oldSelectionWidth / oldScale * newScale` is oldSelectionWidth / `oldScale` • (1+e₂) • `newScale` • (1+e₃), for some |e₃| ≤ 2^-53. Note that this is the argument passed to `round`.

Now substitute the expressions we have for `oldScale` and `newScale`. This yields oldSelectionWidth / (oldDisplayImageWidth / realImageWidth • (1+e₀)) • (1+e₂) • (newDisplayImageWidth / realImageWidth • (1+e₁)) • (1+e₃). The realImageWidth terms cancel, and we can rearrange the others to produce oldSelectionWidth • newDisplayImageWidth / oldDisplayImageWidth • (1+e₁) • (1+e₂) • (1+e₃) / (1+e₀).

We are given that oldSelectionWidth equals oldDisplayImageWidth, so those cancel, and the argument to round is exactly: newDisplayImageWidth • (1+e₁) • (1+e₂) • (1+e₃) / (1+e₀).

Consider the combined error terms minus one (this is the relative error in the final value): (1+e₁) • (1+e₂) • (1+e₃) / (1+e₀) – 1. This expression has its greatest magnitude when e₀ is –2^-53 and the others are +2^-53. Then it is slightly greater than 2^-51, which is two double epsilons (at most 324518553658426753804753784799233 / 730750818665451377972204001751459814038961127424). If newDisplayImageWidth is less than 1125899906842624 (2^50), then newDisplayImageWidth times this relative error is less than ½. Therefore, newDisplayImageWidth • (1+e₁) • (1+e₂) • (1+e₃) / (1+e₀) is within ½ of newDisplayImageWidth.
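The worst-case fraction and the ½ bound can be checked with exact integer arithmetic. The sketch below uses BigInt with N = 2^53, so that the worst case (1+1/N)³ / (1–1/N) – 1 simplifies to (4N² + 3N + 1) / (N²(N – 1)); the variable names are illustrative.

```javascript
const N = 2n ** 53n;                      // reciprocal of the bound 2^-53

// Worst-case relative error as an exact fraction num / den.
const num = 4n * N * N + 3n * N + 1n;
const den = N * N * (N - 1n);

const limit = 2n ** 50n;                  // 1125899906842624

// width * num / den < 1/2  is equivalent to  2 * width * num < den.
const holdsBelowLimit = 2n * (limit - 1n) * num < den;
const holdsAtLimit = 2n * limit * num < den;

console.log(num, den);
console.log(holdsBelowLimit, holdsAtLimit); // true false
```

The second check shows why the condition is "less than" the limit: at exactly 2^50, this simple bound no longer guarantees an error below ½.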

Since newDisplayImageWidth is an integer, if the argument to `round` is within ½ of newDisplayImageWidth, then the result is newDisplayImageWidth.

Therefore, if newDisplayImageWidth is less than 1125899906842624, then newSelectionWidth equals newDisplayImageWidth.

(The above proves that 1125899906842624 is a sufficient limit, but it may not be necessary. A more involved analysis may be able to prove that certain combinations of errors are impossible, so the maximum combined error is less than the one used above. This would relax the limit, allowing larger values of newDisplayImageWidth.)
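As an empirical complement to the proof (not a substitute for it), a randomized check along the following lines could be run. The helper names `randomInt` and `countMismatches`, the trial count, and the choice of 2^49 as the maximum width are all assumptions made for this sketch; 2^49 simply keeps every width below the proven limit.

```javascript
// Hypothetical helper: random integer in [1, max], for max well below 2^53.
function randomInt(max) {
  return Math.floor(Math.random() * max) + 1;
}

// Counts cases where a full-width selection does not map to the new width.
function countMismatches(trials, maxWidth) {
  let mismatches = 0;
  for (let i = 0; i < trials; i++) {
    const oldDisplayImageWidth = randomInt(maxWidth);
    const newDisplayImageWidth = randomInt(maxWidth);
    const realImageWidth = randomInt(maxWidth);
    const oldScale = oldDisplayImageWidth / realImageWidth;
    const newScale = newDisplayImageWidth / realImageWidth;
    const oldSelectionWidth = oldDisplayImageWidth; // selection spans the image
    const newSelectionWidth = Math.round(oldSelectionWidth / oldScale * newScale);
    if (newSelectionWidth !== newDisplayImageWidth) mismatches++;
  }
  return mismatches;
}

console.log(countMismatches(100000, 2 ** 49)); // the proof predicts 0
```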

Thanks for detailed answer, but could you elaborate on what exactly could relax the limit? And what's with other operations? Are there those which demand more than multiplying by `(1 + e)`? — x-yuri, Jun 08 '13 at 01:48
@x-yuri: My statement about relaxing the limit is that, if we worked harder on the proof, we might be able to prove that the result holds even if newDisplayWidth is somewhat higher than 1,125,899,906,842,624. However, unless you intend to work with display widths greater than a quadrillion, there is no point to doing this work. — Eric Postpischil, Jun 08 '13 at 01:52
Also, AFAIU "integers which do not exceed 53 bits" and "integers not wider than the double significand" are those less than or equal to `2^53 - 1`. Am I right? — x-yuri, Jun 08 '13 at 01:57
Also, I believe relative error is `|(1 + e1) • (1 + e2) • (1 + e3) / (1 + e0) – 1|`. But at least in this particular case `(1 + 2^-53) / (1 - 2^-53) - 1 = 1 - (1 - 2^-53) / (1 + 2^-53) = .00000000000000022204`. Not sure, if it's as expected... — x-yuri, Jun 08 '13 at 02:04
@x-yuri: I think the only use I made of the limits on the integers, other than newDisplayWidth, was to ensure that converting them to `double` has no error. For this purpose, all integers less than or equal to 2^53 are satisfactory. (2^53 exceeds 53 bits, but it is exactly representable since its lowest integer bit, which is evicted from the floating-point format due to the number’s magnitude, is zero. 2^53+1 is the first integer not representable in `double`, because there is no room for the necessary 1 bit.) The proof holds for larger integers that are representable, such as 2^53+2. — Eric Postpischil, Jun 08 '13 at 02:25
and btw, what about getting too big or too small numbers somewhere in the course of calculations? Is it impossible within the limits you declared? — x-yuri, Jun 08 '13 at 11:53
@x-yuri: Yes, for inputs under 2^53, all the computed values are well within floating-point limits. No underflow or overflow occurs. — Eric Postpischil, Jun 08 '13 at 13:12
