Which gcc O2 flag may cause failure in fp calculation?

Question

I compiled the Paranoia floating-point test suite on a pc386 system using GCC at the -O2 optimization level and got several failures, but when I compiled it without optimization using the same GCC, I got correct results. I read about the flags that -O2 enables, but none of them seems problematic. What may be the cause? The paranoia code can be found here and this is the output with -O2 optimization:

*** PARANOIA TEST ***
paranoia version 1.1 [cygnus]
Program is now RUNNING tests on small integers:
TEST: 0+0 != 0, 1-1 != 0, 1 <= 0, or 1+1 != 2
PASS: 0+0 != 0, 1-1 != 0, 1 <= 0, or 1+1 != 2
TEST: 3 != 2+1, 4 != 3+1, 4+2*(-2) != 0, or 4-3-1 != 0
PASS: 3 != 2+1, 4 != 3+1, 4+2*(-2) != 0, or 4-3-1 != 0
TEST: -1+1 != 0, (-1)+abs(1) != 0, or -1+(-1)*(-1) != 0
PASS: -1+1 != 0, (-1)+abs(1) != 0, or -1+(-1)*(-1) != 0
TEST: 1/2 + (-1) + 1/2 != 0
PASS: 1/2 + (-1) + 1/2 != 0
TEST: 9 != 3*3, 27 != 9*3, 32 != 8*4, or 32-27-4-1 != 0
PASS: 9 != 3*3, 27 != 9*3, 32 != 8*4, or 32-27-4-1 != 0
TEST: 5 != 4+1, 240/3 != 80, 240/4 != 60, or 240/5 != 48
PASS: 5 != 4+1, 240/3 != 80, 240/4 != 60, or 240/5 != 48
-1, 0, 1/2, 1, 2, 3, 4, 5, 9, 27, 32 & 240 are O.K.

Searching for Radix and Precision.
Radix = 2.000000 .
Closest relative separation found is U1 = 5.4210109e-20 .

Recalculating radix and precision
 confirms closest relative separation U1 .
Radix confirmed.
TEST: Radix is too big: roundoff problems
PASS: Radix is too big: roundoff problems
TEST: Radix is not as good as 2 or 10
PASS: Radix is not as good as 2 or 10
TEST: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?
ERROR: Severity: FAILURE:  (1-U1)-1/2 < 1/2 is FALSE, prog. fails?.
PASS: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?
TEST: Comparison is fuzzy,X=1 but X-1/2-1/2 != 0
PASS: Comparison is fuzzy,X=1 but X-1/2-1/2 != 0
The number of significant digits of the Radix is 64.000000 .
TEST: Precision worse than 5 decimal figures  
PASS: Precision worse than 5 decimal figures  
TEST: Subtraction is not normalized X=Y,X+Z != Y+Z!
PASS: Subtraction is not normalized X=Y,X+Z != Y+Z!
Subtraction appears to be normalized, as it should be.
Checking for guard digit in *, /, and -.
TEST: * gets too many final digits wrong.

PASS: * gets too many final digits wrong.

TEST: Division lacks a Guard Digit, so error can exceed 1 ulp
or  1/3  and  3/9  and  9/27 may disagree
PASS: Division lacks a Guard Digit, so error can exceed 1 ulp
or  1/3  and  3/9  and  9/27 may disagree
TEST: Computed value of 1/1.000..1 >= 1
PASS: Computed value of 1/1.000..1 >= 1
TEST: * and/or / gets too many last digits wrong
PASS: * and/or / gets too many last digits wrong
TEST: - lacks Guard Digit, so cancellation is obscured
ERROR: Severity: SERIOUS DEFECT:  - lacks Guard Digit, so cancellation is obscured.
PASS: - lacks Guard Digit, so cancellation is obscured
Checking rounding on multiply, divide and add/subtract.
TEST: X * (1/X) differs from 1
PASS: X * (1/X) differs from 1
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
TEST: Radix * ( 1 / Radix ) differs from 1
PASS: Radix * ( 1 / Radix ) differs from 1
TEST: Incomplete carry-propagation in Addition
PASS: Incomplete carry-propagation in Addition
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
TEST: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below
ERROR: Severity: FLAW:  lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.
PASS: lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below

Does Multiplication commute?  Testing on 20 random pairs.
     No failures found in 20 integer pairs.

Running test of square root(x).
TEST: Square root of 0.0, -0.0 or 1.0 wrong
PASS: Square root of 0.0, -0.0 or 1.0 wrong
Testing if sqrt(X * X) == X for 20 Integers X.
Test for sqrt monotonicity.
ERROR: Severity: DEFECT:  sqrt(X) is non-monotonic for X near 2.0000000e+00 .
Testing whether sqrt is rounded or chopped.
Square root is neither chopped nor correctly rounded.
Observed errors run from -5.5000000e+00 to 5.0000000e-01 ulps.
TEST: sqrt gets too many last digits wrong
ERROR: Severity: SERIOUS DEFECT:  sqrt gets too many last digits wrong.
PASS: sqrt gets too many last digits wrong
Testing powers Z^i for small Integers Z and i.
ERROR: Severity: DEFECT:  computing
        (1.30000000000000000e+01) ^ (1.70000000000000000e+01)
        yielded 8.65041591938133811e+18;
        which compared unequal to correct 8.65041591938133914e+18 ;
                they differ by -1.02400000000000000e+03 .
Errors like this may invalidate financial calculations
        involving interest rates.
Similar discrepancies have occurred 5 times.
Seeking Underflow thresholds UfThold and E0.
ERROR: Severity: FAILURE:  multiplication gets too many last digits wrong.
Smallest strictly positive number found is E0 = 0 .
ERROR: Severity: FAILURE:  Either accuracy deteriorates as numbers
approach a threshold = 0.00000000000000000e+00
 coming down from 0.00000000000000000e+00
 or else multiplication gets too many last digits wrong.

The Underflow threshold is 0.00000000000000000e+00,  below which
calculation may suffer larger Relative error than merely roundoff.
Since underflow occurs below the threshold
UfThold = (2.00000000000000000e+00) ^ (-inf)
only underflow should afflict the expression
        (2.00000000000000000e+00) ^ (-inf);
actually calculating yields: 0.00000000000000000e+00 .
This computed value is O.K.

Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065041e+00 as X -> 1.
ERROR: Severity: DEFECT:  Calculated 1.00000000000000000e+00 for
        (1 + (0.00000000000000000e+00) ^ (inf);
        differs from correct value by -6.38905609893065041e+00 .
        This much error may spoil financial
        calculations involving tiny interest rates.
Testing powers Z^Q at four nearly extreme values.
 ... no discrepancies found.

Searching for Overflow threshold:
This may generate an error.
Can `Z = -Y' overflow?
Trying it on Y = -inf .
finds a ERROR: Severity: FLAW:  -(-Y) differs from Y.
Overflow threshold is V  = -inf .
Overflow saturates at V0 = inf .
No Overflow should be signaled for V * 1 = -inf
                           nor for V / 1 = -inf .
Any overflow signal separating this * from the one
above is a DEFECT.
ERROR: Severity: FAILURE:  Comparisons involving +--inf, +-inf
and +-0 are confused by Overflow.
ERROR: Severity: SERIOUS DEFECT:    X / X differs from 1 when X = 1.00000000000000000e+00
  instead, X / X - 1/2 - 1/2 = 1.08420217248550443e-19 .
ERROR: Severity: SERIOUS DEFECT:    X / X differs from 1 when X = -inf
  instead, X / X - 1/2 - 1/2 = nan .
ERROR: Severity: SERIOUS DEFECT:    X / X differs from 1 when X = 0.00000000000000000e+00
  instead, X / X - 1/2 - 1/2 = nan .

What message and/or values does Division by Zero produce?
    Trying to compute 1 / 0 produces ...  inf .

    Trying to compute 0 / 0 produces ...  nan .

The number of  FAILUREs  encountered =       4.
The number of  SERIOUS DEFECTs  discovered = 5.
The number of  DEFECTs  discovered =         3.
The number of  FLAWs  discovered =           2.

The arithmetic diagnosed has unacceptable Serious Defects.
Potentially fatal FAILURE may have spoiled this program's subsequent diagnoses.
END OF TEST.
*** END OF PARANOIA TEST ***

EXECUTIVE SHUTDOWN! Any key to reboot...

I think it would make for a better question if you provided the details of the failures as well as the relevant code (as far as practical). Knowing the `gcc` version number wouldn't hurt either. — NPE, Aug 11 '13 at 10:51
Of course, if you could reduce one or more failures to an SSCCE (http://sscce.org/), that would be even better. — NPE, Aug 11 '13 at 10:52
A recent version of GCC on modern hardware can take advantage of the options `-msse2 -mfpmath=sse` to generate assembly that computes each expression exactly to the precision of its type. It may also be informative to look at Paranoia's source code from the point of view of this description: http://gcc.gnu.org/ml/gcc-patches/2008-11/msg00105.html . If a recent GCC is not already generating strict IEEE 754 code for floating-point, either `-std=c99` or `-fexcess-precision=standard` makes the generated assembly conform to the interpretation laid out by Joseph S. Myers. — Pascal Cuoq, Aug 11 '13 at 18:02

score 3 · Accepted Answer · answered Aug 11 '13 at 16:22

Optimization, and -O2 in particular, is not the primary culprit here. The test suite you are running can also fail in a C implementation under other optimization settings. The primary problem in this case appears to be that the Paranoia test checks whether floating-point arithmetic is consistent and has various properties, but the floating-point arithmetic in the C implementation you are using is not consistent: sometimes it uses 80-bit arithmetic and sometimes it uses 64-bit arithmetic (or an approximation to it, such as using 80-bit arithmetic but rounding results to 64-bit floating-point).

Initially, the test finds a number U1 such that 1-U1 differs from 1, and there are no representable values between 1-U1 and 1. That is, U1 is the step size from 1 down to the next representable value in the floating-point format. In your case, the test finds that U1 is about 5.4210109e-20. This U1 is exactly 2^-64. The Intel processor you are running on has an 80-bit floating-point format in which the significand (the fraction part of the floating-point representation) has 64 bits. This 64-bit width of the significand is responsible for the step size being 2^-64, so it is why U1 is 2^-64.

Later, the test evaluates (1-U1)-1/2 and compares it to 1/2. Since 1-U1 is less than 1, subtracting 1/2 should produce a result less than 1/2. However, in this case, your C implementation is evaluating 1-U1 with 64-bit arithmetic, which has a 53-bit significand. With a 53-bit significand, 1-U1 cannot be represented exactly. Since it is very close to 1, the mathematical value of 1-U1 is rounded to 1 in the 64-bit format. Then subtracting 1/2 from this 1 yields 1/2. This 1/2 is not less than 1/2, so the comparison fails, and the program reports an error.

This is a defect of your C implementation. It actually evaluates 1-U1 differently in one place than in another. It uses 80-bit arithmetic in one place and 64-bit in another, and it does not provide a good way to control this. (But there may be switches to use only 64-bit arithmetic; I do not know about your version of GCC.)

Although this is a defect by the standards of people who want good floating-point arithmetic, it is not a defect according to the C standard. The C language standard permits this behavior.

I have not examined failures reported after the first. They likely stem from similar causes.

GCC version is 4.4.7. Does optimization change the floating-point precision? If it doesn't, why does compiling without optimization give correct answers? — Ardalan Pouyabahar, Aug 11 '13 at 18:56
@ArdalanPouyabahar Older GCC versions do not guarantee that the same results will be produced by floating-point computations with and without optimization. Simple computations may be done at compile-time with different semantics than run-time semantics. See http://arxiv.org/abs/cs/0701192 for a rather complete report on the difficulties of predicting floating-point behavior with this kind of compiler, or http://blog.frama-c.com/index.php?post/2013/07/06/On-the-precise-analysis-of-C-programs-for-FLT_EVAL_METHOD-2 and http://blog.frama-c.com/index.php?post/2013/07/24/More-on-FLT_EVAL_METHOD_2 — Pascal Cuoq, Aug 11 '13 at 20:19
@PascalCuoq The links were very helpful, but as far as I can tell, version 4.4.7 is from 2012. Does it have that problem too? — Ardalan Pouyabahar, Aug 13 '13 at 19:02
@ArdalanPouyabahar: Floating-point precision changes at the “whim” of the compiler. Consider when the compiler is trying to evaluate an expression and decides it is out of registers, so it temporarily saves a value by rounding it from an 80-bit floating-point format to a 64-bit format and writes it to the stack. That sort of temporary save might or might not happen when expressions are reorganized, when you change compiler versions, when you change optimization switches, whenever. It is **uncontrolled**. — Eric Postpischil, Aug 13 '13 at 19:11
For what it is worth, I compiled paranoia.c with Apple Clang 4.0 and ran it on a MacPro4,1, using `-pedantic -std=c99` with `-O0` and with `-O3`, and both runs reported no errors. — Eric Postpischil, Aug 13 '13 at 19:13
