Questions tagged [loop-unrolling]

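Loop unrolling is a loop optimization that replicates the loop body so each iteration does more work, reducing loop-control overhead (counter updates and branches) and exposing more instruction-level parallelism, usually at the cost of larger code.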
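As a rough illustration of what the tag covers, here is a minimal sketch of manual unrolling by a factor of 4, with a remainder loop for lengths that are not a multiple of 4. The function names `sum_simple` and `sum_unrolled4` are hypothetical and chosen only for this example; an optimizing compiler may well perform a similar transformation on the simple version by itself.

```cpp
#include <cstddef>
#include <vector>

// Baseline: one addition and one loop test per element.
long long sum_simple(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Unrolled by 4 (hypothetical helper): the loop test runs once per four
// elements, and four independent accumulators shorten the dependency chain.
long long sum_unrolled4(const std::vector<int>& v) {
    const std::size_t n = v.size();
    long long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    long long total = s0 + s1 + s2 + s3;
    // Remainder loop: handles the last n % 4 elements.
    for (; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```

Whether the unrolled form is actually faster depends on the compiler, the flags, and the target CPU, so it is worth measuring both versions.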

164 questions
4
votes
2 answers

SSE Intrinsics and loop unrolling

I am attempting to optimise some loops and I have managed to, but I wonder if I have only done it partially correctly. Say, for example, that I have this loop: for(i=0;i
Kieran Lavelle
4
votes
1 answer

Can C++ loop unrolling be guaranteed by the compiler (gcc)?

I have to make the following AVX operations: __m256 perm, func; __m256 in = _mm256_load_ps(inPtr+x); __m256 acc = _mm256_setzero_ps(); perm = _mm256_shuffle_ps(in, in, _MM_SHUFFLE(3,2,1,0)); func = _mm256_load_ps(fPtr+0); acc = _mm256_add_ps(acc,…
galinette
4
votes
1 answer

SIMD vectorlength and unroll factor for a Fortran loop

I want to vectorize the Fortran loop below with SIMD directives: !DIR$ SIMD DO IELEM = 1 , NELEM X(IKLE(IELEM)) = X(IKLE(IELEM)) + W(IELEM) ENDDO I used the AVX2 instruction set. The program is compiled by ifort main_vec.f -simd -g -pg -O2…
Shiyu
4
votes
2 answers

Can Java recognize the SIMD advantages of the CPU, or is this just the optimization effect of loop unrolling?

This part of the code is from the dot product method of a vector class of mine. The method computes inner products for a target array of vectors (1000 vectors). When the vector length is an odd number (262145), compute time is 4.37 seconds. When vector…
huseyin tugrul buyukisik
3
votes
1 answer

Any options that enable loop inversion in LLVM?

Are there any options that enable loop inversion? More specifically, can LLVM transform a while loop into a do-while loop, as in the following? Before the transformation, the code is: void foo(unsigned a, unsigned& ret) { bool undone = true; …
shu
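For readers unfamiliar with the term, loop inversion can be sketched by hand as follows. This is only an illustration under assumed names (`count_halvings_while` and `count_halvings_inverted` are hypothetical, not taken from the question): the `while` test at the top is replaced by a single guard plus a `do`-`while` whose test sits at the bottom, so each iteration ends with one conditional branch instead of a test at the top and a jump back. In LLVM this kind of transformation is associated with the loop rotation pass (`loop-rotate`), though whether it fires on a given loop depends on the pass pipeline.

```cpp
// while form: the condition is checked at the top of every iteration.
unsigned count_halvings_while(unsigned a) {
    unsigned steps = 0;
    while (a > 1) {
        a /= 2;
        ++steps;
    }
    return steps;
}

// Inverted form (hypothetical hand-written equivalent): one guard up front,
// then a do-while whose condition is checked at the bottom of the body.
unsigned count_halvings_inverted(unsigned a) {
    unsigned steps = 0;
    if (a > 1) {
        do {
            a /= 2;
            ++steps;
        } while (a > 1);
    }
    return steps;
}
```

Both functions compute the same result for every input; only the shape of the control flow differs.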
3
votes
1 answer

How to iterate over a compile-time seq in a manner that unrolls the loop?

I have a sequence of values that I know at compile-time, for example: const x: seq[string] = @["s1", "s2", "s3"] I want to loop over that seq in a manner that keeps the variable a static string instead of a string as I intend to use these strings…
3
votes
0 answers

Loop unroll issue with Visual Studio compiler

I have a simple setup where I noticed that the VS compiler does not seem smart enough to unroll a loop, while other compilers like clang or gcc do so. Am I missing some optimization flag for VS? #include struct A { double data[4]; double…
3
votes
1 answer

Loop unrolling? in Julia with metaprogramming

Is there a way to "metaprogrammatically" obtain a block of code with the following structure: if r1 < R1 s = 1 elseif r1 < R2 s = 2 ... etc until N end end Thanks!
epx
3
votes
2 answers

Determining the optimal value for #pragma unroll N in CUDA

I understand how #pragma unroll works, but if I have the following example: __global__ void test_kernel( const float* B, const float* C, float* A_out) { int j = threadIdx.x + blockIdx.x * blockDim.x; if (j < array_size) { #pragma unroll …
Blizzard
3
votes
1 answer

Pros/cons of different methods of loop unrolling using template metaprogramming

I'm interested in general solutions for loop unrolling at compile time (I'm using this in a SIMD setting where each function call takes a specific number of clock cycles and multiple calls can be performed in parallel, so I need to tune the number…
j_h
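One common way to approach compile-time unrolling in C++ is to expand a `std::index_sequence` so that the body is instantiated once per index. The sketch below is a minimal illustration of that idea, not the asker's code; `unroll`, `unroll_impl`, and `dot4` are hypothetical names, and it assumes C++14 or later (for generic lambdas and `std::index_sequence`).

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Calls f(0), f(1), ..., f(N-1) in order, passing each index as a
// std::integral_constant so it is usable as a compile-time constant.
template <typename F, std::size_t... Is>
void unroll_impl(F&& f, std::index_sequence<Is...>) {
    // Braced-init-list expansion guarantees left-to-right evaluation.
    using expand = int[];
    (void)expand{0, (f(std::integral_constant<std::size_t, Is>{}), 0)...};
}

// Hypothetical helper: "unrolls" f exactly N times at compile time.
template <std::size_t N, typename F>
void unroll(F&& f) {
    unroll_impl(std::forward<F>(f), std::make_index_sequence<N>{});
}

// Example use: a fixed-size dot product with no runtime loop counter.
inline int dot4(const std::array<int, 4>& a, const std::array<int, 4>& b) {
    int acc = 0;
    unroll<4>([&](auto i) { acc += a[i] * b[i]; });
    return acc;
}
```

The unroll factor is a template argument, so tuning it means changing `N` at the call site. The pack expansion makes the repetition explicit in the source, but the compiler is still free to schedule or merge the resulting operations as it sees fit.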
3
votes
1 answer

Decrease in instructions retired after loop unrolling

I have an O(N^4) image processing loop, and after profiling it (using Intel VTune 2013), I see that the number of instructions retired is reduced drastically. I need help understanding this behavior on a multicore architecture. (I'm using Intel Xeon…
3
votes
0 answers

Loop unrolling for matrix multiplication

I need to write an implementation of matrix multiplication that is better than the naive method. Here are the methods I used: 1- removed false dependencies, which made the performance a lot better 2- used a recursive approach and then there is…
3
votes
3 answers

Is there an optimization similar to loop unroll for functional programming?

Disclaimer: I know little about the GHC compilation pipeline, but I hope to learn more about it with this post, for example, whether comparing imperative vs functional is relevant to code compilation. As you know, loop unrolling reduces the number of…
MdxBhmt
3
votes
3 answers

Unrolling JavaScript loops

I have a cubic 3D array "class" like this: function Array3D(size) { this.data = new Array(size*size*size); var makeIndex = function(p) { return p[0] + p[1]*size + p[2]*size*size; } this.get = function(p) { return…
Eric
2
votes
2 answers

Counted/Uncounted loops and Safepoints - is `while (++i < someInt)` considered an uncounted loop?

I was reading this post on Counted/Uncounted loops and Safepoints. What it tells me is that there will be safepoint polls in an Uncounted loop, meaning an Uncounted loop performs worse than a Counted loop. In the blog there's this interesting…
wayne