Questions tagged [vector-processing]
10 questions
23
votes
3 answers
Fastest way to do horizontal vector sum with AVX instructions
I have a packed vector of four 64-bit floating-point values.
I would like to get the sum of the vector's elements.
With SSE (and using 32-bit floats) I could just do the following:
v_sum = _mm_hadd_ps(v_sum, v_sum);
v_sum = _mm_hadd_ps(v_sum,…

Luigi Castelli
- 676
- 2
- 6
- 13
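For the horizontal-sum question above, one commonly suggested pattern is sketched here; it is not taken from the question or its answers. It adds the upper 128-bit half of the vector to the lower half, then adds the two remaining lanes. The function name `hsum4_pd` and the pointer-based signature are assumptions made for illustration. The `target("avx")` attribute is a GCC/Clang extension used so the sketch builds without `-mavx`; with other compilers, drop it and enable AVX through compiler options instead.

```c
#include <immintrin.h>

/* Hypothetical helper: sum of four doubles using AVX.
 * The target attribute is GCC/Clang-specific (assumption, see above). */
__attribute__((target("avx")))
double hsum4_pd(const double *p)
{
    __m256d v  = _mm256_loadu_pd(p);            /* {v0, v1, v2, v3}       */
    __m128d lo = _mm256_castpd256_pd128(v);     /* {v0, v1}               */
    __m128d hi = _mm256_extractf128_pd(v, 1);   /* {v2, v3}               */
    __m128d s  = _mm_add_pd(lo, hi);            /* {v0+v2, v1+v3}         */
    __m128d sw = _mm_unpackhi_pd(s, s);         /* {v1+v3, v1+v3}         */
    return _mm_cvtsd_f64(_mm_add_sd(s, sw));    /* (v0+v2) + (v1+v3)      */
}
```

Note that the additions are grouped as (v0+v2) + (v1+v3), so the rounding can differ slightly from a left-to-right scalar sum.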
21
votes
2 answers
How to vectorize with gcc?
The v4 series of the gcc compiler can automatically vectorize loops using the SIMD processor on some modern CPUs, such as the AMD Athlon or Intel Pentium/Core chips. How is this done?

casualcoder
- 4,770
- 6
- 29
- 35
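As a rough illustration for the gcc question above, here is a loop of the kind GCC's auto-vectorizer can typically handle. The function `saxpy` is a hypothetical example, not something from the question. With GCC, vectorization is enabled at `-O3` or explicitly with `-ftree-vectorize`, and `-fopt-info-vec` reports which loops were vectorized. Whether this particular loop is vectorized depends on the compiler version and target.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * 'restrict' promises the compiler that x and y do not overlap,
 * which removes an aliasing check that can otherwise block or
 * complicate vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

A possible build line, assuming the code is saved as `saxpy.c`: `gcc -O3 -fopt-info-vec -c saxpy.c`.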
19
votes
4 answers
How to find the horizontal maximum in a 256-bit AVX vector
I have a __m256d vector packed with four 64-bit floating-point values.
I need to find the horizontal maximum of the vector's elements and store the result in a double-precision scalar value.
My attempts all ended up using a lot of shuffling of the…

Luigi Castelli
- 676
- 2
- 6
- 13
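For the horizontal-maximum question above, a minimal sketch follows; it is one possible approach, not the asker's code or an accepted answer. It reduces the upper 128-bit half against the lower half and then compares the two remaining lanes. The name `hmax4_pd` and the pointer-based signature are assumptions for illustration, and the `target("avx")` attribute is a GCC/Clang extension used so the sketch builds without `-mavx`. NaN inputs follow the behavior of the underlying max instructions and are not handled specially here.

```c
#include <immintrin.h>

/* Hypothetical helper: maximum of four doubles using AVX.
 * The target attribute is GCC/Clang-specific (assumption, see above). */
__attribute__((target("avx")))
double hmax4_pd(const double *p)
{
    __m256d v  = _mm256_loadu_pd(p);            /* {v0, v1, v2, v3}           */
    __m128d lo = _mm256_castpd256_pd128(v);     /* {v0, v1}                   */
    __m128d hi = _mm256_extractf128_pd(v, 1);   /* {v2, v3}                   */
    __m128d m  = _mm_max_pd(lo, hi);            /* {max(v0,v2), max(v1,v3)}   */
    __m128d sw = _mm_unpackhi_pd(m, m);         /* upper lane copied to lower */
    return _mm_cvtsd_f64(_mm_max_sd(m, sw));    /* max of the two lanes       */
}
```

This uses one lane extract and one unpack, which keeps the shuffle count low.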
5
votes
2 answers
Auto-vectorizing vs. vectorized code by hand
Is it better in some sense to vectorize code by hand, using explicit pragmas, or to rely on auto-vectorization? For optimum performance using auto-vectorization, one would have to monitor the compiler output to ensure that loops are being…

casualcoder
- 4,770
- 6
- 29
- 35
4
votes
9 answers
What compilers besides gcc can vectorize code?
GCC can vectorize loops automatically when certain options are specified and given the right conditions. Are there other compilers widely available that can do the same?

casualcoder
- 4,770
- 6
- 29
- 35
2
votes
2 answers
Difference between a vector and an array processor
Can someone please explain the difference between a vector and an array processor, which one encounters when learning about the computer architecture involved in parallel programming?
One of the sources I referred to says that the vector…

Aim
- 389
- 6
- 20
2
votes
2 answers
Pluggable vector processing units in Clojure
I'm developing some simulation software in Clojure that will need to process lots of vector data (basically originating as offsets into arrays of Java floats, with lengths typically in the 10-10000 range). Large numbers of these vectors will need to go…

mikera
- 105,238
- 25
- 256
- 415
1
vote
1 answer
Can VPP plugins be implemented using Go?
VPP provides the I/S for developing custom plugins that can be hooked into a graph of nodes. I've only seen examples of such plugins written in C, and was wondering whether another language, Go for instance, can also be used to write…

omer
- 1,242
- 4
- 18
- 45
0
votes
0 answers
VPP Host Stack Udp Data Stream
I'm a newbie to VPP. I have to implement a UDP connection between VPPs.
The business logic is going to look like this:
App1 <--> VPP1 memif <--> NIC <----> UDP packets <----> NIC <--> VPP2 memif <--> App2
What I have done so far: I have connected App1 and VPP1…

Mustafa
- 147
- 1
- 12
0
votes
1 answer
Is it possible to find the max vector length of the vector processor in Fortran?
Is it possible to test in Fortran whether the processor is a vector processor and to find out its maximum vector length?
I checked the cpuinfo as listed below
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R)…

Shiyu
- 110
- 9