How can I convert C++ code to assembly using the SSE instruction set?

Asked Sep 13 '22 at 13:24

Active Sep 13 '22 at 17:30

Viewed 118 times

How can I convert the below code to assembly using the SSE instruction set?

for (int &elem : elems) {
    int temp = 255 - elem > 0 ? 255 - elem : 0;
    results.push_back(temp);
}

I don't want to use intrinsic C++ functions.

I can't understand how to pass `elems` to assembly, or how to work on multiple values in parallel using the SSE instruction set.

edited Sep 13 '22 at 17:30

Remy Lebeau

Just give the compiler a hint about which processor architecture it may target with `-march=....` (for GCC/Clang), and enable optimizations with `-O2`. – Marek R Sep 13 '22 at 13:29

[Demo](https://godbolt.org/z/oYMTafT36) – Marek R Sep 13 '22 at 13:39

@MarekR it only half worked: GCC used AVX2 there, but only for one `int` at a time – harold Sep 13 '22 at 15:21

Stop using `.push_back` inside your loop if you want the compiler to vectorize, like we said in comments on [your last question about this](https://stackoverflow.com/questions/73698461/how-to-load-array-elements-in-mmx-or-sse-registers-to-do-sum-operation-on-them). Also, you need `-O3` for full vectorization; `-O2` enables vectorization only in very easy cases. Why does this need to be conditional? I thought your input elements were in the 0..255 range (so you could pack them into 8-bit elements and get 4x the work done per SIMD vector). – Peter Cordes Sep 13 '22 at 16:34
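As a rough illustration of the last comment, the loop could be restructured along the following lines. This is a minimal sketch, not a guaranteed recipe: the function names `invert_clamped` and `invert_bytes` are hypothetical, and whether a given compiler actually auto-vectorizes these loops depends on the compiler version and flags, so the generated assembly should be inspected.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rewrite of the original loop: the output is sized once up
// front and written by index, so there is no push_back inside the loop.
std::vector<int> invert_clamped(const std::vector<int>& elems) {
    std::vector<int> results(elems.size());
    for (std::size_t i = 0; i < elems.size(); ++i) {
        results[i] = std::max(255 - elems[i], 0);
    }
    return results;
}

// If the inputs are known to lie in 0..255, they fit in 8-bit elements.
// Then 255 - elem can never be negative, so no conditional is needed.
std::vector<std::uint8_t> invert_bytes(const std::vector<std::uint8_t>& elems) {
    std::vector<std::uint8_t> results(elems.size());
    for (std::size_t i = 0; i < elems.size(); ++i) {
        results[i] = static_cast<std::uint8_t>(255 - elems[i]);
    }
    return results;
}
```

The 8-bit variant only applies under the assumption stated in the comment, namely that every input is in the 0..255 range.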

0 Answers