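Loop unrolling is a loop optimization that replicates the loop body so that each pass does the work of several iterations, trading larger code for less loop-control overhead.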
Questions tagged [loop-unrolling]
164 questions
6 votes, 2 answers
Loop unroll (with bitwise operations)
I am writing a Linux Kernel driver (for ARM) and in an irq handler I need to check the interrupt bits.
bit
0/16 End point 0 In/Out interrupt
(very likely, while In is more likely)
1/17 End point 1 In/Out interrupt
...
15/31 End point 15…

Alvin Wong
5 votes, 1 answer
How to give the compiler the hint about the maximum time a loop would run
// if I know that in_x will never be bigger than Max
template <unsigned Max>
void foo(unsigned in_x)
{
    unsigned cap = Max;
    // I can tell the compiler this loop will never run more than log(Max) times
    for (; cap != 0 && in_x != 0; cap…

BlueWanderer
5 votes, 1 answer
`Out of resources` error while doing loop unrolling
When I increase the unrolling from 8 to 9 loops in my kernel, it breaks with an out of resources error.
I read in "How do I diagnose a CUDA launch failure due to being out of resources?" that a mismatch of parameters and an overuse of registers could…

Framester
5 votes, 2 answers
Vectorization: when is worth manually unrolling loops?
I would like a general understanding of when I can expect a compiler to vectorize a loop, and when it is worth unrolling the loop myself to help it decide to vectorize.
I understand the details are very important (what compiler,…

luca
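For readers landing on the question above, here is a minimal sketch of the kind of manual unrolling it has in mind. The function names `sum_rolled` and `sum_unrolled4` are made up for illustration and do not come from the question. The idea, assuming a floating-point reduction, is that a compiler normally may not reorder floating-point additions unless told it can (for example with GCC's `-ffast-math`), so splitting the sum into independent accumulators by hand is one way to expose parallelism. Whether this actually helps depends on the compiler, the flags, and the target.

```c
#include <stddef.h>

/* Straightforward reduction: one accumulator, one dependent add per iteration. */
float sum_rolled(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Manually unrolled by 4 with four independent accumulators.
   Note: this changes the order of the additions, so the result can differ
   in the last bits from sum_rolled for inputs that are not exactly representable. */
float sum_unrolled4(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main body: four elements per pass. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    float s = (s0 + s1) + (s2 + s3);

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

It is usually worth comparing the generated assembly of both versions before keeping the manual one, since an optimizer may already do this on its own at higher optimization levels.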
5 votes, 2 answers
C loop unrolling optimization performance
First: I know what loop optimization is and how it works, yet I found a case where I cannot explain the results.
I created a prime number checker that calls modulo on each number from 2 to n - 1, so no algorithmic optimizations.
EDIT: I know that…

jklmnn
5 votes, 4 answers
unrolling for loops in a special case function
So I'm trying to optimize some code. I have a function with a variable-sized loop. However, for efficiency's sake, I want to make the cases with loop sizes of 1, 2 and 3 special cases that are completely unrolled. My approach so far is to declare the loop…

p clark
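One common shape for this kind of special-casing is a `switch` on the trip count, with the small cases written out in full and a generic loop as the fallback. The sketch below is only an illustration: `step` is a hypothetical stand-in for whatever the real loop body does, and `fold` is an invented name, neither is taken from the question.

```c
#include <stddef.h>

/* Hypothetical per-element operation standing in for the real loop body. */
static inline int step(int acc, int x)
{
    return acc * 31 + x;
}

/* Folds v[0..n-1] with step(); sizes 1, 2 and 3 are fully unrolled,
   every other size goes through the generic loop. */
int fold(const int *v, size_t n)
{
    int acc = 0;
    switch (n) {
    case 0:
        return acc;
    case 1:
        return step(acc, v[0]);
    case 2:
        return step(step(acc, v[0]), v[1]);
    case 3:
        return step(step(step(acc, v[0]), v[1]), v[2]);
    default:
        for (size_t i = 0; i < n; i++)
            acc = step(acc, v[i]);
        return acc;
    }
}
```

Keeping the generic loop as the reference makes it easy to check that each unrolled case returns exactly what the loop would for the same input.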
5 votes, 1 answer
Loop unrolling in Metal kernels
I need to force the Metal compiler to unroll a loop in my kernel compute function. So far I've tried to put #pragma unroll(num_times) before a for loop, but the compiler ignores that statement.
It seems that the compiler doesn't unroll the loops…

sarasvati
5 votes, 2 answers
Porting duff's device from C to JavaScript
I have this kind of Duff's device in C and it works fine (it formats text as money):
#include
#include
char *money(const char *src, char *dst)
{
const char *p = src;
char *q = dst;
size_t len;
len = strlen(src);
…

David Ranieri
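For context, the textbook form of Duff's device is sketched below as a byte copy. This is not the `money` function from the question, whose body is cut off in the excerpt; `duff_copy` is an illustrative name. The trick is that the `switch` jumps into the middle of an 8-way unrolled `do`/`while` body, so the leftover `count % 8` elements are handled by the first, partial pass.

```c
#include <stddef.h>

/* Copies count bytes from `from` to `to` using Duff's device.
   The buffers must not overlap. */
void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;

    size_t n = (count + 7) / 8;   /* number of passes, including the partial one */

    switch (count % 8) {          /* jump into the unrolled body */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

The construct relies on C allowing `case` labels inside a nested statement, which is why a port to a language whose `switch` does not permit that usually ends up as a separate remainder step followed by an ordinary unrolled loop.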
5 votes, 3 answers
does unrolling loops in x86-64 actually make code faster?
I assume everyone knows what "unrolling loops" means. Just in case, I'll give a concrete example in a moment. The question I will ask is: does unrolling loops in x86-64 assembly language actually make code faster? I will explain why I begin to…

honestann
5 votes, 2 answers
C++ Loop Unrolling Performance Difference (Project Euler)
I have a question about a Project Euler problem and optimization using loop unrolling.
Problem description:
2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder. What is the smallest positive…

Blaine
4 votes, 1 answer
Does compiler only unroll the outer loop completely?
I am trying to compile this code, using loop-specific pragmas to tell the compiler how many times to unroll a counted loop.
#include <vector>
int main() {
std::vector v(8192);
#pragma GCC unroll 8 // 16
for (int i = 0; i < 16; i++) {
for…

Cache
4 votes, 2 answers
Unrolling For Loop in C
I am trying to unroll this loop by a factor of 2.
for(i=0; i<100; i++){
    x[i] = y[i] + z[i];
    z[i] = y[i] + a[i];
    z[i+1] = y[i] * a[i];
}
I have it unrolled to:
for(i=0; i<100; i+=2){
    x[i] = y[i] + z[i];
    x[i+1] = y[i+1] + z[i+1];
    z[i]…

programminglearner
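A point worth checking in an unrolling like this is statement order: in the original loop, `z[i+1]` is written in iteration `i` and then read by `x[i+1] = y[i+1] + z[i+1]` in iteration `i+1`, so the second copy of the body has to come after the whole first copy. The sketch below shows one unrolling that keeps that order. The element type `int`, the function names, and the assumption that the arrays do not overlap are mine, not the question's; note also that `z` needs 101 elements because of the `z[i+1]` store.

```c
#define N 100   /* trip count from the question; even, so no remainder loop is needed */

/* The loop as posted. z must have N + 1 elements. */
void rolled(int *x, const int *y, int *z, const int *a)
{
    for (int i = 0; i < N; i++) {
        x[i]     = y[i] + z[i];
        z[i]     = y[i] + a[i];
        z[i + 1] = y[i] * a[i];
    }
}

/* Unrolled by 2: iteration i in full, then iteration i + 1 in full. */
void unrolled2(int *x, const int *y, int *z, const int *a)
{
    for (int i = 0; i < N; i += 2) {
        /* iteration i */
        x[i]     = y[i] + z[i];
        z[i]     = y[i] + a[i];
        z[i + 1] = y[i] * a[i];
        /* iteration i + 1: reads the z[i + 1] stored just above */
        x[i + 1] = y[i + 1] + z[i + 1];
        z[i + 1] = y[i + 1] + a[i + 1];
        z[i + 2] = y[i + 1] * a[i + 1];
    }
}
```

Running both versions on the same input and comparing `x` and `z` element by element is a cheap way to confirm that a hand-unrolled loop still computes the same thing.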
4 votes, 1 answer
Loop unrolling - G++ vs. Clang++
I was wondering whether it is worth aiding the compiler with templates to unroll a simple loop. I prepared the following test:
#include
#include
#include
class TNode
{
public:
void Assemble();
void Assemble(TNode…

metalfox
4 votes, 1 answer
Manual loop unrolling with known maximum size
Please take a look at this code in an OpenCL kernel:
uint point_color = 4278190080;
float point_percent = 1.0f;
float near_pixel_size = (...);
float far_pixel_size = (...);
float delta_pixel_size = far_pixel_size - near_pixel_size;
float3 near =…

Izhido
4 votes, 4 answers
GCC inline asm NOP loop not being unrolled at compile time
Venturing out of my usual VC++ realm into the world of GCC (via MINGW32). Trying to create a Windows PE that consists largely of NOPs, à la:
for(i = 0; i < 1000; i++)
{
    asm("nop");
}
But either I'm using the wrong syntax or the compiler is…

Rushyo
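If the goal is simply a long run of NOPs in the output, one option that does not depend on the optimizer unrolling anything is to let the assembler do the repetition. The sketch below assumes GCC or Clang with a GNU-compatible assembler (for the `.rept`/`.endr` directives) and a target whose assembler accepts the mnemonic `nop`; the function name `emit_nops` is invented for illustration.

```c
/* Emits 1000 consecutive NOP instructions inline.
   The repetition is done by the assembler's .rept directive, not by the
   C compiler, so it does not depend on optimization flags. */
void emit_nops(void)
{
    __asm__ volatile(
        ".rept 1000\n\t"
        "nop\n\t"
        ".endr");
}
```

Disassembling the resulting object file is the simplest way to confirm that the run of NOPs is actually there.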