Questions tagged [loop-unrolling]

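Loop unrolling is a loop optimization that replicates the loop body several times per iteration, reducing loop-control overhead (counter updates and branches) and exposing more instruction-level parallelism, at the cost of larger code.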
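As a rough illustration of the idea, the sketch below shows the same summation written as a plain loop and unrolled by hand by a factor of 4. The function names `sum_rolled` and `sum_unrolled4` are made up for this example, and the factor of 4 is an arbitrary choice; a compiler may well perform a similar transformation on its own.

```c
#include <stddef.h>

/* Plain loop: one addition plus one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Same computation unrolled by 4 (hypothetical example, factor chosen
 * arbitrarily): the loop condition is checked once per four elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;

    /* Main unrolled body: handles elements in groups of four. */
    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

The remainder loop is needed whenever the trip count is not a multiple of the unroll factor.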

164 questions
9
votes
2 answers

Force/Convince/Trick GCC into Unrolling _Longer_ Loops?

How do I convince GCC to unroll a loop where the number of iterations is known, but large? I'm compiling with -O3. The real code in question is more complex, of course, but here's a boiled-down example that has the same behavior: int const…
jgustafson
  • 193
  • 1
  • 7
8
votes
1 answer

Unroll loop and do independent sum with vectorization

For the following loop GCC will only vectorize the loop if I tell it to use associative math e.g. with -Ofast. float sumf(float *x) { x = (float*)__builtin_assume_aligned(x, 64); float sum = 0; for(int i=0; i<2048; i++) sum += x[i]; return…
Z boson
  • 32,619
  • 11
  • 123
  • 226
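The "independent sum" idea in the question above can be sketched by hand as follows. This is only an illustration, not the asker's code: the name `sumf_4acc` and the choice of four accumulators are assumptions. Because floating-point addition is not associative, splitting the sum this way changes the order of additions, which is the reordering a compiler will not do on its own without options such as -Ofast.

```c
#include <stddef.h>

/* Hypothetical helper: sums n floats using four independent accumulators.
 * The partial sums do not depend on each other, so the additions can
 * overlap in the pipeline or map onto vector lanes. */
float sumf_4acc(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Unrolled body: each accumulator takes every fourth element. */
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }

    /* Combine the partial sums, then add any leftover elements. */
    float sum = (s0 + s1) + (s2 + s3);
    for (; i < n; i++)
        sum += x[i];

    return sum;
}
```

The result can differ in the last bits from a strictly sequential sum, since the additions happen in a different order.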
8
votes
1 answer

GLSL shader not unrolling loop when needed

My 9600GT hates me. Fragment shader: #version 130 uint aa[33] = uint[33]( 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0 ); void main() { int i=0; int a=26; for (i=0; i
user2464424
  • 1,536
  • 1
  • 14
  • 28
8
votes
2 answers

Is there a way to unroll loops in an AMD OpenCL kernel with the compiler?

I'm trying to assess the performance differences between OpenCL for AMD and Nvidia GPUs. I have a kernel which performs matrix-vector multiplication. I'm running the kernel on two different systems at the moment, my laptop which has an NVidia…
8
votes
2 answers

In what types of loops is it best to use the #pragma unroll directive in CUDA?

In CUDA it is possible to unroll loops using the #pragma unroll directive to improve performance by increasing instruction level parallelism. The #pragma can optionally be followed by a number that specifies how many times the loop must be…
charis
  • 429
  • 6
  • 16
7
votes
1 answer

Should I look into PTX to optimize my kernel? If so, how?

Do you recommend reading your kernel's PTX code to find out how to optimize your kernels further? One example: I read that one can find out from the PTX code whether the automatic loop unrolling worked. If this is not the case, one would have to unroll the…
Framester
  • 33,341
  • 51
  • 130
  • 192
7
votes
2 answers

Why doesn't Hotspot JIT perform loop unrolling for long counters?

I just read the Java Magazine article Loop Unrolling. There the authors demonstrate that simple for loops with an int counter are compiled with loop unrolling optimization: private long intStride1() { long sum = 0; for (int i = 0; i < MAX; i…
7
votes
2 answers

Loop unrolling in clang

I am trying to selectively unroll the second loop in the following program: #include int main() { int in[1000], out[1000]; int i,j; #pragma nounroll for (i = 100; i < 1000; i++) { in[i]+= 10; } …
k01
  • 71
  • 1
  • 1
  • 3
7
votes
3 answers

Correct way of unrolling loop using gcc

#include <stdio.h> int main() { int i; for(i=0;i<10000;i++){ printf("%d",i); } } I want to do loop unrolling on this code using gcc, but even using the flag: gcc -O2 -funroll-all-loops --save-temps unroll.c the…
Neel Choudhury
  • 573
  • 1
  • 5
  • 11
6
votes
3 answers

template arguments inside a compile time unrolled for loop?

Wikipedia (here) gives a compile-time unrolling of a for loop. I was wondering whether we can use a similar for loop with template statements inside. For example, is the following loop valid: template void…
user796530
6
votes
2 answers

Why do neither V8 nor spidermonkey seem to unroll static loops?

Doing a small check, it looks like neither V8 nor SpiderMonkey unrolls loops, even if it is completely obvious how long they are (literal as condition, declared locally): const f = () => { let counter = 0; for (let i = 0; i < 100_000_000; i++)…
Doofus
  • 952
  • 4
  • 19
6
votes
3 answers

Is Duff's device still useful?

I see that Duff's device is just a way to do loop unrolling in C. https://en.wikipedia.org/wiki/Duff%27s_device I am not sure why it is still useful nowadays. Shouldn't the compiler be smart enough to do loop unrolling?
user1424739
  • 11,937
  • 17
  • 63
  • 152
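For readers unfamiliar with the construct asked about above, here is a minimal sketch of Duff's device in its common memory-to-memory copy form. The function name `duff_copy` and the `short` element type are choices made for this example; Duff's original wrote to a single output register and did not increment the destination. The `switch` jumps into the middle of the unrolled `do`/`while` body to handle the leftover `count % 8` elements first.

```c
#include <stddef.h>

/* Copy `count` shorts from `from` to `to`, unrolled 8 times.
 * Hypothetical example function in the style of Duff's device. */
void duff_copy(short *to, const short *from, size_t count)
{
    if (count == 0)
        return;                      /* the loop body would run at least once */

    size_t n = (count + 7) / 8;      /* number of passes through the do-loop */

    switch (count % 8) {             /* enter the body part-way on the first pass */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The case labels fall through on purpose, so a compiler with fallthrough warnings enabled may flag them.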
6
votes
1 answer

Is it beneficial anymore to unroll loops in C++ over fixed-sized arrays?

I want to use the std::array to store the data of N-dimensional vectors and implement arithmetic operations for such vectors. I figured, since the std::array now has a constexpr size() member function, I can use this to unroll the loops that I need…
tmaric
  • 5,347
  • 4
  • 42
  • 75
6
votes
6 answers

Unrolling loops using templates in C++ with partial specialization

I'm trying to use templates to unroll a loop in C++ as follows. #include <iostream> template< class T, T i > struct printDown { static void run(void) { std::cout << i << "\n"; printDown< T, i - 1 >::run(); } }; template<…
6
votes
1 answer

GCC 5.1 Loop unrolling

Given the following code #include <stdio.h> int main(int argc, char **argv) { int k = 0; for( k = 0; k < 20; ++k ) { printf( "%d\n", k ) ; } } Using GCC 5.1 or later with -x c -std=c99 -O3 -funroll-all-loops --param…
surrz
  • 295
  • 2
  • 11