Questions tagged [loop-unrolling]

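Loop unrolling is a loop optimization that replicates the loop body so each iteration does more work, reducing loop-control overhead (branches and counter updates) at the cost of larger code.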
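As a rough illustration of what the tag covers, here is a minimal sketch in C of a summation loop unrolled by a factor of 4. The function names `sum_rolled` and `sum_unrolled4` are hypothetical and chosen only for this example. Using four separate accumulators is one common variant, not the only way to unroll.

```c
#include <stddef.h>

/* Baseline: one addition and one loop-condition check per element. */
float sum_rolled(const float *a, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}

/* Unrolled by 4 (hypothetical helper): four additions per
 * loop-condition check, with independent accumulators so the
 * additions do not form a single dependency chain. */
float sum_unrolled4(const float *a, size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main unrolled loop: handles elements in groups of 4. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    float sum = (s0 + s1) + (s2 + s3);

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

Because floating-point addition is not associative, the unrolled version can differ from the rolled one in the low-order bits for general inputs. Whether it is actually faster depends on the compiler and target, since many compilers already unroll such loops at higher optimization levels.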

164 questions
0
votes
1 answer

How to unroll a loop of a dot product in MIPS after re-ordering instructions?

I got this question about loop unrolling in MIPS, but I could not figure out how to proceed once I reached the step I will show you below, and I am not even sure about these steps. I am new to computer architecture. I have this code snippet in assembly: Loop: ld…
0
votes
1 answer

Loop unrolling with multidimensional array using MIPS 64

I'm preparing for a university exam. The course is about computers, specifically MIPS 64. To get to the point: an exercise asks me to use loop unrolling with multidimensional arrays. I'm able to handle the exercise as long as…
Antonio
0
votes
0 answers

Unroll for loop during build time in JS React with Webpack

We are using react-router-dom in our project, so we need to generate a Router with routes. We also have a custom navigation bar where the user can click to go to a page. Right now, when we need to add or remove pages, we need to update a minimum of 2…
Marko Taht
0
votes
1 answer

Does unrolling a loop affect the accuracy of the computations within?

Summarized question: Does unrolling a loop affect the accuracy of the computations performed within the loop? And if so, why? Elaboration and background: I am writing a compute shader using HLSL for use in a Unity project (2021.2.9f1). Parts of my…
0
votes
0 answers

AVX loop unrolling

I generate a high-performance loop at runtime which, for example, sums two arrays. I want to unroll my loop. Which sequence of operations inside the loop should I choose: a. Load as much data as possible (constrained by the number of ymm registers) b. Process…
Yuriy
0
votes
1 answer

last warp loop unrolling in Nvidia's parallel reduction tutorial problem

I ran into a problem understanding the logic behind the "last warp loop unrolling" technique in Nvidia's parallel reduction tutorial available here. In the case of thread 31 (for which tid=31), before unrolling the loop: this thread only executes…
0
votes
1 answer

How does the warp loop unrolling work in Harris' Parallel Reduction tutorial?

I am following the reduction in CUDA presentation by Mark Harris. I've gotten to optimization step #5 and I am confused by the main logic of the warpReduce() function: __device__ void warpReduce(volatile int* sdata, int tid) { sdata[tid] += sdata[tid…
kingwales
0
votes
0 answers

Loop Unrolling Using Nested Loops and If/Else Statements 10 x 10 Unrolling

So, I have two loops that I would like to attempt a 10 x 10 unrolling on. I have never really done this. I have seen some simple examples that did not involve if/else statements or nested loops, so I am at a loss as to how to do this for…
terrylt1
0
votes
2 answers

Portable loop unrolling with template parameter in C++ with GCC/ICC

I am working on a high-performance parallel computational fluid dynamics code that involves a lot of lightweight loops and therefore gains approximately 30% in performance if all important loops are fully unrolled. This can be done easily for a…
2b-t
0
votes
2 answers

Unrolling nested loops in C++

I'm trying to unroll a nested loop that stores data in a dynamically allocated 2D buffer in C++. However, I'm not quite sure how to do it. Here is my original loop before unrolling: int steps[1]; Ipp32f* vectx = ippiMalloc_32f_C1(size0, size1,…
user11512155
0
votes
1 answer

Loop unrolling for nested for loops in C

I originally have this function, and I am trying to optimize it further using loop unrolling, which I am having trouble with. Flipping the for loops increased the efficiency, as did moving the calls outside the loops. However, when it comes…
Kiro Reda
0
votes
1 answer

Convert function to Arm Neon

I'm a beginner in Arm Neon, and I'm trying to vectorise this loop: float ans=0.0; for (i=0; i
Snake91
0
votes
1 answer

Why does a program achieve the throughput bound given an unrolling factor k>C*L?

I'm reading Computer Systems: A Programmer's Perspective, 3/E (CS:APP3e) by Randal E. Bryant and David R. O'Hallaron. The authors say: "In general, a program can achieve the throughput bound for an operation only when it can keep the pipelines filled…
0
votes
1 answer

How can loop unrolling cause cache misses?

I have read (on Wikipedia) that loop unrolling can cause instruction cache misses, but I don't understand how. From my understanding, whether the loop is unrolled or not, it will still execute the same instructions, with just the difference that the…
Saleem
0
votes
0 answers

GLSL ES does not behave the same way with for loop and unrolled loop

It seems GLSL ES 3.0 does not execute my code properly. I wrote the same code twice, first in an unrolled manner and second with a for loop: // Unrolled version: float minDistance = 1000000.0; vec3 finalColor1 = vec3(0.0); int…
arthur.sw