Questions tagged [sse3]

SSE3, Streaming Single Instruction Multiple Data Extensions 3, also known by its Intel code name Prescott New Instructions (PNI), is the third iteration of the SSE instruction set for the IA-32 (x86) architecture. Intel introduced SSE3 in early 2004 with the Prescott revision of their Pentium 4 CPU. In April 2005, AMD introduced a subset of SSE3 in revision E (Venice and San Diego) of their Athlon 64 CPUs.

28 questions
2
votes
2 answers

Compiling a gnu program without sse3

I'm compiling an app for a device whose architecture does not support SSE beyond SSE2, and was wondering whether it is possible to disable compiling with SSE3 instructions from GNU autoconf-generated configure scripts. I know you can turn it off in…
Charles Ma
  • 47,141
  • 22
  • 87
  • 101
2
votes
1 answer

ROS (Robot Operating System) with SSSE3 flag

I started working with ROS lately and got stuck on one problem. I need to use some classes which require the SSE2, SSE3 and SSSE3 CPU extensions. I tried to edit the manifest.xml file of my ROS package like
SolvedForHome
  • 152
  • 1
  • 15
2
votes
1 answer

Should I compile with -mssse3 in the presence of ASM SSSE3 code?

I have a question regarding compiling a build of x264 on GCC. x264 has assembly code dealing with instruction sets such as SSE3 and SSSE3 and by default has auto-vectorization disabled in the makefile. Should I compile it with the -mssse3 flag…
Philos
  • 37
  • 4
2
votes
1 answer

Sum of the four 32-bit elements of a __m128 vector

I'm using intrinsics to optimize a program of mine. But now I would like to sum the four elements that are in a __m128 vector in order to compare the result to a floating-point value. For instance, let's say I have this 128-bit vector: {a, b, c,…
Merkil
  • 23
  • 3
1
vote
1 answer

Really basic SSE

I have a very simple program whose performance I am trying to improve. One way that I know will help is to use SSE3 (since the machine I am working on supports it), but I have absolutely no idea how to do this. Here is a code snippet…
AndroidDev
  • 20,466
  • 42
  • 148
  • 239
1
vote
0 answers

Processor optimization flags in OpenCV

I'm building an application that uses OpenCV and will run on a variety of Windows computers (Win7, Win8, Win10). I have now discovered that my application crashes randomly on some computers. After a lot of googling I have realized that…
4-bit
  • 230
  • 3
  • 13
1
vote
0 answers

MinGW error Type '__m128i' could not be resolved in Eclipse

In Eclipse with MinGW I am trying to compile C code that uses some Intel intrinsics (SSE2, SSE3). I have given the compiler options -march=native -msse2 -msse3 -mssse3 -msse4.1 but I am getting an error Type '__m128i' could not be…
Mohan
  • 1,871
  • 21
  • 34
1
vote
1 answer

converting from int to (16-bit) __m128i

I have the following code as a part of a program, but when I compile it I get the following error: cannot convert ‘int’ to ‘__m128i {aka __vector(2) long long int}’ in assignment Where the code is: int t; int s; __m128i *array; __m128i…
MROF
  • 147
  • 1
  • 3
  • 9
1
vote
1 answer

Reading file with "gaps" in destination array

I'm trying to find a way to read a file into an array with "gaps": So the read data is in the byte array buffer at the positions buffer[0], buffer[2], .., buffer[2*i], without any significant speed loss. More specifically I want to read it int-wise…
Arokh
  • 614
  • 1
  • 10
  • 18
0
votes
0 answers

SSE for 2D arrays

I want to change the following code using SSE3 instructions: for (i=0; i<=imax+1; i++) { /* The vertical velocity approaches 0 at the north and south * boundaries, but fluid flows freely in the horizontal direction */ …
0
votes
0 answers

Adding horizontally with SSE3

I am trying to write some simple code using SSE and SSE3 to calculate the sum of all the elements of an array. The difference is that in one version I do the sum "vertically" using PADDD and in the other I do the sum horizontally, using HADDPS.…
julix
  • 13
  • 5
0
votes
1 answer

How to use vectors in assembly code x86 and SSE

I don't know how to access an STL vector in x86. I have tried to do it like that but I have some errors. mov ebx, stl_vector mov eax, [ebx] ;Here I want to store the first element of the vector mov edx, [ebx + 4] ; I want to store the second element…
0
votes
1 answer

AVX and Bubble Sort

I have to develop a bubble sort algorithm with AVX instructions with single precision numbers in input. Can anyone help me to look for the best implementation? I did a bubble sort version for SSE3: global sort32 sort32: start mov eax, [ebp+8] …
Frank
  • 730
  • 2
  • 9
  • 20