Questions tagged [vector-processing]

10 questions
23
votes
3 answers

Fastest way to do horizontal vector sum with AVX instructions

I have a packed vector of four 64-bit floating-point values. I would like to get the sum of the vector's elements. With SSE (and using 32-bit floats) I could just do the following: v_sum = _mm_hadd_ps(v_sum, v_sum); v_sum = _mm_hadd_ps(v_sum,…
Luigi Castelli
21
votes
2 answers

How to vectorize with gcc?

The v4 series of the GCC compiler can automatically vectorize loops using the SIMD units on some modern CPUs, such as the AMD Athlon or Intel Pentium/Core chips. How is this done?
19
votes
4 answers

How to find the horizontal maximum in a 256-bit AVX vector

I have a __m256d vector packed with four 64-bit floating-point values. I need to find the horizontal maximum of the vector's elements and store the result in a double-precision scalar value. My attempts all ended up using a lot of shuffling of the…
Luigi Castelli
5
votes
2 answers

Auto-vectorizing vs. vectorized code by hand

Is it better in some sense to vectorize code by hand, using explicit pragmas, or to rely on auto-vectorization? For optimum performance using auto-vectorization, one would have to monitor the compiler output to ensure that loops are being…
casualcoder
4
votes
9 answers

What compilers besides gcc can vectorize code?

GCC can vectorize loops automatically when certain options are specified and given the right conditions. Are there other compilers widely available that can do the same?
casualcoder
2
votes
2 answers

Difference between a vector and an array processor

Can someone please explain the difference between a vector processor and an array processor, which one encounters when learning about the computer architecture involved in parallel programming? One of the sources I referred to says that the vector…
Aim
2
votes
2 answers

Pluggable vector processing units in Clojure

I'm developing some simulation software in Clojure that will need to process lots of vector data (basically originating as offsets into arrays of Java floats, length typically in 10-10000 range). Large numbers of these vectors will need to go…
mikera
1
vote
1 answer

Can VPP plugins be implemented using Go?

VPP provides the I/S for developing custom plugins that can be hooked into a graph of nodes. I've only seen examples of such plugins written in C, and was wondering whether other languages, Go for instance, can also be used to write…
omer
0
votes
0 answers

VPP Host Stack Udp Data Stream

I'm a newbie to VPP. I have to implement a UDP connection between VPPs. The business logic is going to be like this: App1 <--> VPP1 memif <--> NIC <----> UDP packets <----> NIC <--> VPP2 memif <--> App2. What I have done so far: I have connected App1 and VPP1…
Mustafa
0
votes
1 answer

Is it possible to find the max vector length of the vector processor in Fortran?

Is it possible to test in Fortran whether the processor is a vector processor and find out the maximum vector length? I checked cpuinfo, as listed below: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 63 model name : Intel(R) Xeon(R)…
Shiyu