Compiler optimization involves adapting a compiler to reduce run time, object size, or both. This can be accomplished using compiler arguments (e.g. CFLAGS, LDFLAGS), compiler plugins (Dehydra, for instance), or direct modifications to the compiler (such as modifying its source code).
Questions tagged [compiler-optimization]
3117 questions
65
votes
3 answers
Why don't modern compilers coalesce neighboring memory accesses?
Consider the following code:
bool AllZeroes(const char buf[4])
{
    return buf[0] == 0 &&
           buf[1] == 0 &&
           buf[2] == 0 &&
           buf[3] == 0;
}
Output assembly from Clang 13 with -O3:
AllZeroes(char const*): …
user17507206
65
votes
3 answers
Can the compiler optimize from heap to stack allocation?
As far as compiler optimizations go, is it legal and/or possible to change a heap allocation to a stack allocation? Or would that break the as-if rule?
For example, say this is the original version of the code
{
    Foo* f = new Foo();
…

Cory Kramer
65
votes
2 answers
Why does MSVS not optimize away +0?
This question demonstrates a very interesting phenomenon: denormalized floats slow down the code by more than an order of magnitude.
The behavior is well explained in the accepted answer. However, there is one comment, with currently 153 upvotes, that…

Vorac
64
votes
2 answers
How does GCC optimize out an unused variable incremented inside a loop?
I wrote this simple C program:
int main() {
    int i;
    int count = 0;
    for(i = 0; i < 2000000000; i++){
        count = count + 1;
    }
}
I wanted to see how the gcc compiler optimizes this loop (clearly add 1 2000000000 times should be…

Haile
64
votes
4 answers
Compiler optimizations may cause integer overflow. Is that okay?
I have an int x. For simplicity, say ints occupy the range -2^31 to 2^31-1. I want to compute 2*x-1. I allow x to be any value 0 <= x <= 2^30. If I compute 2*(2^30), I get 2^31, which is an integer overflow.
One solution is to compute 2*(x-1)+1.…

mbang
64
votes
9 answers
Why don't compilers merge redundant std::atomic writes?
I'm wondering why no compilers are prepared to merge consecutive writes of the same value to a single atomic variable, e.g.:
#include <atomic>
std::atomic<int> y(0);
void f() {
    auto order = std::memory_order_relaxed;
    y.store(1, order);
…

PeteC
63
votes
5 answers
Is a C compiler allowed to coalesce sequential assignments to volatile variables?
I'm having a theoretical (non-deterministic, hard to test, never happened in practice) hardware issue reported by the hardware vendor, where a double-word write to certain memory ranges may corrupt any future bus transfers.
While I don't have any…

Andreas
63
votes
5 answers
Why is there no implicit parallelism in Haskell?
Haskell is functional and pure, so basically it has all the properties needed for a compiler to be able to tackle implicit parallelism.
Consider this trivial example:
f = do
  a <- Just 1
  b <- Just $ Just 2
  -- ^ The above line does not utilize…

Nikita Volkov
61
votes
4 answers
Why is this seemingly slower C loop actually twice as fast as the other way?
I'm an R developer who uses C for algorithmic purposes and have a question about why a C loop that seems like it would be slow is actually faster than the alternative approach.
In R, our Boolean type can actually have three values, true, false, and…

Davis Vaughan
61
votes
8 answers
Do unused functions get optimized out?
Compilers these days tend to do a significant amount of optimizations. Do they also remove unused functions from the final output?

Paul Manta
58
votes
3 answers
Optimize in CMake by default
I have a C++ project which uses CMake as its build system. I'd like the following behavior:
If cmake is invoked as cmake .., then CMAKE_CXX_FLAGS is -O3 -Wall -Wextra
If cmake is invoked as cmake .. -DCMAKE_BUILD_TYPE=Debug, then CMAKE_CXX_FLAGS is…

marmistrz
58
votes
1 answer
Pragmatics of typed intermediate languages
One trend in compilation is to use typed intermediate languages. Haskell's GHC, with its Core intermediate language, a variant of System F-omega, is an example of this architecture [1]. Another is LLVM, which has a typed intermediate language…

Martin Berger
57
votes
2 answers
Understanding the as-if rule, "the program was executed as written"
I am trying to understand the as-if rule. According to cppreference:
The as-if rule
Allows any and all code transformations that do not change the observable behavior of the program
Explanation
The C++ compiler is permitted to perform any…

poohRui
57
votes
3 answers
Java program runs slower when code that is never executed is commented out
I observed some strange behaviour in one of my Java programs. I have tried to strip the code down as much as possible while still being able to replicate the behaviour. Code in full below.
public class StrangeBehaviour {
    static boolean…

J3D1
56
votes
4 answers
Why do none of the major compilers optimize this conditional store that checks if the value is already set?
I stumbled across this Reddit post which is a joke on the following code snippet,
void f(int& x) {
    if (x != 1) {
        x = 1;
    }
}
void g(int& x) {
    x = 1;
}
saying that the two functions are not equivalent to 'the compiler'.
I was…

chrysante