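Loop unrolling is a loop optimization that replicates the loop body several times per iteration, reducing the number of branch and counter-update instructions at the cost of larger code.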
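As a rough illustration of the idea, here is a minimal sketch of unrolling a reduction by hand. The function names `sum_simple` and `sum_unrolled4` and the unroll factor of 4 are assumptions chosen for this example; they do not come from any of the questions listed below.

```cpp
#include <cstddef>
#include <vector>

// Baseline: one addition, one counter update and one branch per element.
long long sum_simple(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Hypothetical hand-unrolled version (factor 4, chosen for illustration).
// Four independent accumulators let the additions overlap, and the loop
// condition is evaluated once per four elements instead of once per element.
long long sum_unrolled4(const std::vector<int>& v) {
    const std::size_t n = v.size();
    long long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    long long total = (s0 + s1) + (s2 + s3);
    // Remainder loop: handles the last n % 4 elements.
    for (; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```

Optimizing compilers often perform this transformation themselves, so whether the manual version is actually faster depends on the compiler, its flags, and the target CPU; measuring is the only reliable way to tell.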
Questions tagged [loop-unrolling]
164 questions
4 votes, 2 answers
SSE Intrinsics and loop unrolling
I am attempting to optimise some loops and I have managed but I wonder if I have only done it partially correct. Say for example that I have this loop:
for(i=0;i

Kieran Lavelle
4 votes, 1 answer
Can C++ loop unrolling be guaranteed by the compiler (gcc)?
I have to make the following AVX operations:
__m256 perm, func;
__m256 in = _mm256_load_ps(inPtr+x);
__m256 acc = _mm256_setzero_ps();
perm = _mm256_shuffle_ps(in, in, _MM_SHUFFLE(3,2,1,0));
func = _mm256_load_ps(fPtr+0);
acc = _mm256_add_ps(acc,…

galinette
4 votes, 1 answer
SIMD vectorlength and unroll factor for Fortran loop
I want to vectorize the Fortran loop below with SIMD directives:
!DIR$ SIMD
DO IELEM = 1 , NELEM
X(IKLE(IELEM)) = X(IKLE(IELEM)) + W(IELEM)
ENDDO
I used the AVX2 instruction set. The program is compiled by
ifort main_vec.f -simd -g -pg -O2…

Shiyu
4 votes, 2 answers
Java can recognize SIMD advantages of CPU; or there is just optimization effect of loop unrolling
This code is from the dot-product method of my vector class. The method computes the inner product for a target array of vectors (1000 vectors).
When the vector length is an odd number (262145), the compute time is 4.37 seconds. When vector…

huseyin tugrul buyukisik
3 votes, 1 answer
Any options that enable loop inversion in LLVM?
Are there any options that enable loop inversion? More specifically, can LLVM transform a while loop into a do-while loop, as in the following?
Before the transformation, the code is:
void foo(unsigned a, unsigned& ret) {
bool undone = true;
…

shu
3 votes, 1 answer
How to iterate over a compile-time seq in a manner that unrolls the loop?
I have a sequence of values that I know at compile time, for example: const x: seq[string] = @["s1", "s2", "s3"]
I want to loop over that seq in a manner that keeps the variable a static string instead of a string, as I intend to use these strings…

Philipp Doerner
3 votes, 0 answers
Loop unroll issue with Visual Studio compiler
I have a simple setup where I noticed that the VS compiler does not seem smart enough to unroll a loop, while other compilers such as clang and gcc do. Am I missing some optimization flag for VS?
#include
struct A
{
double data[4];
double…

Feuerteufel
3 votes, 1 answer
Loop unrolling? in Julia with metaprogramming
Is there a way to "metaprogrammatically" obtain a block of code with the following structure:
if r1 < R1
s = 1
elseif r1 < R2
s = 2
... etc until N
end
end
Thanks!

epx
3 votes, 2 answers
Determining the optimal value for #pragma unroll N in CUDA
I understand how #pragma unroll works, but if I have the following example:
__global__ void
test_kernel( const float* B, const float* C, float* A_out)
{
int j = threadIdx.x + blockIdx.x * blockDim.x;
if (j < array_size) {
#pragma unroll
…

Blizzard
3 votes, 1 answer
Pros/cons of different methods of loop unrolling using template metaprogramming
I'm interested in general solutions for loop unrolling at compile time (I'm using this in a SIMD setting where each function call takes a specific number of clock cycles and multiple calls can be performed in parallel, so I need to tune the number…

j_h
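One common way to get the compile-time unrolling asked about above is a C++17 fold expression over `std::index_sequence`. The sketch below is only one possible approach, not the method from the question or its answer; the helper names `unroll`, `unroll_impl`, and `sum_first_four` are hypothetical and introduced purely for illustration.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: expands to f(0), f(1), ..., f(N-1) at compile time.
// Each index is passed as a std::integral_constant, so the callee can use it
// in constant expressions (e.g. as a template argument).
template <typename F, std::size_t... Is>
constexpr void unroll_impl(F&& f, std::index_sequence<Is...>) {
    (f(std::integral_constant<std::size_t, Is>{}), ...);
}

// Entry point: the unroll factor N is a template parameter (requires C++17).
template <std::size_t N, typename F>
constexpr void unroll(F&& f) {
    unroll_impl(std::forward<F>(f), std::make_index_sequence<N>{});
}

// Usage example: sums four elements with no loop counter in the source.
inline int sum_first_four(const std::array<int, 4>& a) {
    int total = 0;
    unroll<4>([&](auto i) { total += a[i]; });
    return total;
}
```

The source contains no runtime loop, but whether the generated machine code is fully straight-line still depends on the compiler inlining the lambda calls, which typically requires optimization to be enabled.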
3 votes, 1 answer
Decrease in instructions retired after loop unrolling
I have an O(N^4) image processing loop, and after profiling it (using Intel VTune 2013) I see that the number of instructions retired is reduced drastically. I need help understanding this behavior on a multicore architecture. (I'm using Intel Xeon…

quantumshiv
3 votes, 0 answers
Loop unrolling for matrix multiplication
I need to write an implementation of matrix multiplication that is better than the naive method.
Here are the methods I used:
1- removed false dependencies, which made the performance a lot better
2- used a recursive approach
and then there is…

Ahmed Abdel Moneim Elket
3 votes, 3 answers
Is there an optimization similar to loop unroll for functional programming?
Disclaimer: I know little about the GHC compilation pipeline, but I hope to learn more about it with this post, for example whether comparing imperative vs. functional code is relevant to compilation.
As you know, loop unrolling reduces the number of…

MdxBhmt
3 votes, 3 answers
Unrolling JavaScript loops
I have a cubic 3D array "class" like this:
function Array3D(size) {
this.data = new Array(size*size*size);
var makeIndex = function(p) {
return p[0] + p[1]*size + p[2]*size*size;
}
this.get = function(p) { return…

Eric
2 votes, 2 answers
Counted/Uncounted loops and Safepoints - is `while (++i < someInt)` considered an uncounted loop?
I was reading this post on Counted/Uncounted loops and Safepoints.
What it tells me is that there will be safepoint polls in an uncounted loop, meaning an uncounted loop has worse performance than a counted loop.
In the blog there's this interesting…

wayne