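Loop unrolling is a loop optimization that replicates the loop body several times per iteration, reducing loop-control overhead (counter updates and branches) and exposing more instruction-level parallelism, at the cost of larger code.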
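As a rough illustration of what the tag covers, here is a minimal sketch of a loop and a hand-unrolled version of it. The function names and the unroll factor of 4 are assumptions chosen for the example; they do not come from any question on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Plain loop: one add, one compare and one branch per element. */
long sum_rolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Same computation unrolled by 4 (factor picked for illustration only). */
long sum_unrolled4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    size_t i = 0;

    /* Main body: four elements per iteration, one compare and branch. */
    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

Both functions return the same result for any input. Whether the unrolled form is actually faster depends on the compiler and the target, and an optimizing compiler may well perform this transformation on its own.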
Questions tagged [loop-unrolling]
164 questions
9 votes, 2 answers
Force/Convince/Trick GCC into Unrolling _Longer_ Loops?
How do I convince GCC to unroll a loop where the number of iterations is known, but large?
I'm compiling with -O3.
The real code in question is more complex, of course, but here's a boiled-down example that has the same behavior:
int const…
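One option sometimes suggested for this kind of problem is GCC's loop-specific unroll pragma, which, as far as I know, is available in GCC 8 and later. The sketch below is not the asker's code: the function name, the trip count of 64 and the loop body are all hypothetical, and whether GCC fully honors the request can still depend on the loop and the optimization level.

```c
#include <assert.h>
#include <stddef.h>

enum { N_ITER = 64 };  /* hypothetical trip count, not from the question */

/* Sums the squares of N_ITER elements of v. */
int sum_squares(const int *v)
{
    assert(v != NULL);
    int total = 0;

    /* Ask GCC to unroll the loop that immediately follows up to 64 times.
       Compilers that do not know this pragma ignore it (possibly with a warning). */
#pragma GCC unroll 64
    for (int i = 0; i < N_ITER; i++)
        total += v[i] * v[i];

    return total;
}
```

Inspecting the generated assembly (for example with -S) is the only reliable way to confirm that the unrolling actually happened.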

jgustafson
8 votes, 1 answer
Unroll loop and do independent sum with vectorization
GCC will only vectorize the following loop if I tell it to use associative math, e.g. with -Ofast.
float sumf(float *x)
{
    x = (float*)__builtin_assume_aligned(x, 64);
    float sum = 0;
    for(int i=0; i<2048; i++) sum += x[i];
    return…
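The "independent sum" idea in the title can be sketched by hand: unroll the loop and give each unrolled lane its own accumulator, so the additions no longer form a single dependency chain. This is only an illustration under assumptions. The function name and the choice of four accumulators are hypothetical, and the length 2048 is taken from the excerpt above.

```c
#include <assert.h>
#include <stddef.h>

/* Sums 2048 floats using four independent partial sums. */
float sumf_4acc(const float *x)
{
    assert(x != NULL);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;

    /* 2048 is a multiple of 4, so no remainder loop is needed. */
    for (int i = 0; i < 2048; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }

    /* Combine the partial sums at the end. */
    return (s0 + s1) + (s2 + s3);
}
```

Note that this changes the order of the floating-point additions, so the result can differ slightly from the sequential loop. That reordering is exactly what the compiler is not allowed to do on its own without associative math being enabled.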

Z boson
8 votes, 1 answer
GLSL shader not unrolling loop when needed
My 9600GT hates me.
Fragment shader:
#version 130
uint aa[33] = uint[33](
    0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,
    0,0,0
);
void main() {
    int i=0;
    int a=26;
    for (i=0; i

user2464424
8 votes, 2 answers
Is there a way to unroll loops in an AMD OpenCL kernel with the compiler?
I'm trying to assess the performance differences between OpenCL for AMD and Nvidia GPUs. I have a kernel which performs matrix-vector multiplication. I'm running the kernel on two different systems at the moment: my laptop, which has an NVidia…

andymr
8 votes, 2 answers
In what types of loops is it best to use the #pragma unroll directive in CUDA?
In CUDA it is possible to unroll loops using the #pragma unroll directive to improve performance by increasing instruction level parallelism. The #pragma can optionally be followed by a number that specifies how many times the loop must be…

charis
7 votes, 1 answer
Should I look into PTX to optimize my kernel? If so, how?
Do you recommend reading your kernel's PTX code to find out how to optimize your kernels further?
One example: I read that one can find out from the PTX code whether the automatic loop unrolling worked. If this is not the case, one would have to unroll the…

Framester
7 votes, 2 answers
Why doesn't HotSpot JIT perform loop unrolling for long counters?
I just read the Java Magazine article "Loop Unrolling". There, the authors demonstrate that simple for loops with an int counter are compiled with the loop unrolling optimization:
private long intStride1()
{
    long sum = 0;
    for (int i = 0; i < MAX; i…

Leprechaun
7 votes, 2 answers
Loop unrolling in clang
I am trying to selectively unroll the second loop in the following program:
#include
int main()
{
    int in[1000], out[1000];
    int i,j;
    #pragma nounroll
    for (i = 100; i < 1000; i++)
    {
        in[i]+= 10;
    }
    …

k01
7 votes, 3 answers
Correct way of unrolling loop using gcc
#include <stdio.h>
int main() {
    int i;
    for(i=0;i<10000;i++){
        printf("%d",i);
    }
}
I want to unroll the loop in this code using gcc, but even with the flag
gcc -O2 -funroll-all-loops --save-temps unroll.c
the…

Neel Choudhury
6 votes, 3 answers
Template arguments inside a compile-time unrolled for loop?
Wikipedia gives an example of compile-time unrolling of a for loop.
I was wondering whether we can use a similar for loop with template statements inside.
For example, is the following loop valid?
template
void…
user796530
6 votes, 2 answers
Why do neither V8 nor SpiderMonkey seem to unroll static loops?
Doing a small check, it looks like neither V8 nor SpiderMonkey unrolls loops, even when it is completely obvious how many iterations they run (a literal as the condition, counter declared locally):
const f = () => {
    let counter = 0;
    for (let i = 0; i < 100_000_000; i++)…

Doofus
6 votes, 3 answers
Is Duff's device still useful?
As I understand it, Duff's device is just a way to do loop unrolling in C.
https://en.wikipedia.org/wiki/Duff%27s_device
I am not sure why it would still be useful nowadays. Shouldn't the compiler be smart enough to do the loop unrolling itself?
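For readers who have not seen it, a minimal sketch of the technique may help. This is a memory-to-memory variant with a hypothetical function name (the classic version writes to a single fixed output location instead of advancing the destination pointer). It unrolls the copy by 8 and uses a switch to jump into the middle of the loop body to handle the remainder.

```c
#include <assert.h>
#include <stddef.h>

/* Copies count shorts from 'from' to 'to', unrolled by 8, Duff's device style. */
void duff_copy(short *to, const short *from, size_t count)
{
    if (count == 0)
        return;                      /* the do-while below assumes count > 0 */
    assert(to != NULL && from != NULL);

    size_t n = (count + 7) / 8;      /* number of passes through the do-while */

    /* Jump into the loop body so the first pass copies count % 8 elements
       (or 8 when the remainder is 0); every later pass copies all 8. */
    switch (count % 8) {
    case 0: do { *to++ = *from++;    /* each case falls through on purpose */
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The interleaving of switch and do-while is legal C, but some compilers warn about the implicit fallthrough, and an ordinary loop or memcpy is usually what a modern optimizer handles best.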

user1424739
6 votes, 1 answer
Is it still beneficial to unroll loops in C++ over fixed-size arrays?
I want to use std::array to store the data of N-dimensional vectors and implement arithmetic operations for such vectors. I figured that, since std::array now has a constexpr size() member function, I can use this to unroll the loops that I need…

tmaric
6 votes, 6 answers
Unrolling loops using templates in C++ with partial specialization
I'm trying to use templates to unroll a loop in C++ as follows.
#include <iostream>
template< class T, T i >
struct printDown {
    static void run(void) {
        std::cout << i << "\n";
        printDown< T, i - 1 >::run();
    }
};
template<…

Ashley
6 votes, 1 answer
GCC 5.1 Loop unrolling
Given the following code
#include <stdio.h>
int main(int argc, char **argv)
{
    int k = 0;
    for( k = 0; k < 20; ++k )
    {
        printf( "%d\n", k ) ;
    }
}
Using GCC 5.1 or later with
-x c -std=c99 -O3 -funroll-all-loops --param…

surrz