8

I am using SSE2 intrinsics to optimize the bottlenecks of my application and have the following question:

ddata = _mm_xor_si128(_mm_xor_si128(
    _mm_sll_epi32(xdata, 0x7u), _mm_srl_epi32(tdata, 0x19u)), xdata);

With the Microsoft C++ compiler this won't compile, because the types __m128i and unsigned int (which I pass to the _mm_sll_epi32 intrinsic) are not interchangeable.

Why is this so, and how should I pass an arbitrary unsigned int value to _mm_sll_epi32?


__m128i is defined as:

typedef union __declspec(intrin_type) _CRT_ALIGN(16) __m128i {
    __int8              m128i_i8[16];
    __int16             m128i_i16[8];
    __int32             m128i_i32[4];    
    __int64             m128i_i64[2];
    unsigned __int8     m128i_u8[16];
    unsigned __int16    m128i_u16[8];
    unsigned __int32    m128i_u32[4];
    unsigned __int64    m128i_u64[2];
} __m128i;
Paul R
Yippie-Ki-Yay

2 Answers

11

It should be:

ddata = _mm_xor_si128(_mm_xor_si128(
    _mm_slli_epi32(xdata, 0x7), _mm_srli_epi32(tdata, 0x19)), xdata);

Note the i for "immediate". Without it, the shift intrinsic expects a vector (__m128i) as its second argument and reads the shift count from the low 64 bits of that vector.
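To make the difference concrete, here is a minimal sketch of both forms. The helper names (shl_const7, shl_runtime, lane0) are made up for illustration and are not part of any library; only the _mm_* calls are real SSE2 intrinsics. The runtime variant assumes you really do need a count that is not known at compile time; for a fixed count such as 7 or 0x19, the immediate form is the simpler choice.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cstdint>

// Immediate form: the shift count is a compile-time constant.
inline __m128i shl_const7(__m128i v) {
    return _mm_slli_epi32(v, 7);
}

// Vector-count form: the count is only known at run time.
// _mm_sll_epi32 reads the count from the low 64 bits of its second
// argument, so the scalar is first moved into the low lane of a
// vector; _mm_cvtsi32_si128 zeroes the remaining bits.
inline __m128i shl_runtime(__m128i v, unsigned count) {
    // Counts above 31 zero every lane; 0x7FFFFFFF is an arbitrary
    // sanity bound so the cast to int below stays non-negative.
    assert(count <= 0x7FFFFFFFu);
    const __m128i c = _mm_cvtsi32_si128(static_cast<int>(count));
    return _mm_sll_epi32(v, c);
}

// Hypothetical helper for inspecting results: returns 32-bit lane 0.
inline std::uint32_t lane0(__m128i v) {
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(v));
}
```

Both helpers shift all four 32-bit lanes by the same amount; the vector-count form does not give you a separate count per lane.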

Paul R
6

You can use _mm_slli_epi32 (note the i) and likewise _mm_srli_epi32. They take an integer shift count rather than an __m128i.

user7116