Questions tagged [fast-math]

The `-ffast-math` (or similarly-named) compiler option trades precision and adherence to the IEEE 754 floating-point standard for execution speed

Most compilers have an option for turning on floating-point optimizations which sacrifice computational precision and/or adherence to the costlier corner cases of the IEEE 754 floating-point standard in favor of better execution speed.

  • For gcc and clang, this option is named -ffast-math (and there are sub-options)
  • For nvcc, the name is --use_fast_math
  • For OpenCL compilation, the name is -cl-fast-relaxed-math

For more information: What does gcc's ffast-math actually do?

44 questions
7 votes, 1 answer

Do denormal flags like Denormals-Are-Zero (DAZ) affect comparisons for equality?

If I have 2 denormal floating point numbers with different bit patterns and compare them for equality, can the result be affected by the Denormals-Are-Zero flag, the Flush-to-Zero flag, or other flags on commonly used processors? Or do these flags…
Zachary Burns
6 votes, 2 answers

Dynamic -ffast-math

Is it possible to selectively turn -ffast-math on/off during runtime? For example, creating classes FastMath and AccurateMath with the common base class Math, so that one would be able to use both implementations during runtime? Ditto for flushing…
Michael
5 votes, 0 answers

How can I find math operations that would be optimized by `-ffast-math`?

The -ffast-math C++ compiler option allows the compiler to perform more math optimizations that may slightly change behavior. For example, x * 10 / 10 should cancel but due to the possibility of overflow it would slightly change behavior, and x /…
Aaron Franke
5 votes, 2 answers

When should I use gcc's -Ofast optimization level?

In Xcode 5, the optimization settings introduce a new level named -Ofast (Fastest, Aggressive Optimizations). When and how should I use this level?
user49354
5 votes, 1 answer

Does -use-fast-math option translate SP multiplications to intrinsics?

I had a quick glance at the CUDA Programming Guide w.r.t. -use-fast-math optimizations; although Appendix C mentions divisions being converted to an intrinsic, there is no mention of multiplications. The reason I ask this question is, my…
Sayan
4 votes, 0 answers

Flag to generate 'deterministic' floating point operations wrt. pointer alignment with 'fast-math'?

The -ffast-math option in gcc allows the compiler to reorder floating-point operations to have a faster execution. This may lead to slight differences between the results of these operations depending on the alignment of pointers. For instance, on…
Marc
4 votes, 1 answer

What is GCC/Clang equivalent of -fp-model fast=1 in ICC

As I read on Intel's website: Intel compiler uses /fp-model fast=1 as defaults. This optimization favors speed over standards compliance. You may use compiler option -mieee-fp to get compliant code. My understanding of the fp-model option in…
marcin
3 votes, 1 answer

Fortran code compiled in one Windows machine (2018 Intel processor) gives different results when exe is copied to other machine (2022 Intel processor)

Is it possible that a Fortran code compiled on one Windows machine with Visual Studio 2019 on a 2018 Intel processor gives a slightly different result when the exe is copied to another machine (that has a 2022 Intel processor)? Could you please list…
Millemila
3 votes, 1 answer

Why is std::inner_product slower than the naive implementation?

This is my naive implementation of dot product:

```cpp
float simple_dot(int N, float *A, float *B) {
    float dot = 0;
    for (int i = 0; i < N; ++i) {
        dot += A[i] * B[i];
    }
    return dot;
}
```

And this is using the C++ library: float…
ijklr
3 votes, 1 answer

OpenCL Fast Relaxed Math

What does the OpenCL compiler option -cl-fast-relaxed-math do? From reading the documentation - it looks like -cl-fast-relaxed-math allows a kernel to do floating point math on any variables - even if those variables point to the wrong data type,…
benshope
  • 2,936
  • 4
  • 27
  • 39
2 votes, 1 answer

Why can't the Rust compiler auto-vectorize this FP dot product implementation?

Let's consider a simple reduction, such as a dot product:

```rust
pub fn add(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b.iter()).fold(0.0, |c, (x, y)| c + x * y)
}
```

Using rustc 1.68 with -C opt-level=3 -C target-feature=+avx2,+fma I get .LBB0_5: …
Unlikus
2 votes, 2 answers

Mingw32 std::isnan with -ffast-math

I am compiling the following code with the -ffast-math option:

```cpp
#include <iostream>
#include <cmath>
#include <limits>

int main() {
    std::cout << std::isnan(std::numeric_limits<double>::quiet_NaN()) << std::endl;
}
```

I am getting 0 as output. How…
André Puel
2 votes, 0 answers

Why does -fno-signed-zeros have an effect on vectorization for minimum search?

See this simple minimum search (Godbolt):

```cpp
float foo(const float *data, int n) {
    float v = data[0];
    for (int i = 1; i < n; i++) {
        float d = data[i];
        if (d < v) {
            v = d;
        }
    }
    return v;
}
```

Neither gcc…
geza
2 votes, 2 answers

Where is the source of imprecise calculation in the assembler code of gcc -Ofast compared with -O3?

The following 3 lines give imprecise results with "gcc -Ofast -march=skylake":

```cpp
int32_t i = -5;
const double sqr_N_min_1 = (double)i * i;
1. - ((double)i * i) / sqr_N_min_1
```

Obviously, sqr_N_min_1 gets 25., and in the 3rd line (-5 * -5) / 25 should…
Hartmut Pfitzinger
2 votes, 1 answer

What does the "denormal input" exactly mean in assembly when we consider using DAZ flag for SSE Floating Points

I've read This article and do-denormal-flags-like-denormals-are-zero-daz-affect-comparisons-for-equality and I understand the usage and difference between FTZ and DAZ flags. DAZ applies on input, FTZ on output from an FP operation. What confused me…
lionel