
I am working with SSE2 instructions on VS2013 and I realized that some functions in the Intel documentation are missing from the header they are supposed to be in.

The intrinsic void _mm_storeu_si32 (void* mem_addr, __m128i a) should be available via #include <immintrin.h>, but it is not. I do have access to other intrinsics from this header though, such as __m128d _mm_undefined_pd (void).

If I search the header file itself (as shipped with VS2013), there is indeed no mention of the _mm_storeu_si32 intrinsic.

How can this be?

Norgannon
  • I haven't studied this extensively, which is why this is a comment and not an answer: I would expect the linked intrinsic list to cover the intrinsics of Intel's own C++ compiler, ICC. For VS you can find the list here: https://learn.microsoft.com/en-us/cpp/intrinsics/x86-intrinsics-list?view=vs-2019 although be warned that the list is updated with each VS edition, so your old edition may have even fewer than what you see in the list now. – bolov Sep 23 '19 at 14:51
  • I think you got it right! I thought SSE2 support meant that every intrinsic in the Intel Intrinsics Guide is available in SSE2-capable compilers, but that does not seem to be the case. If I understand correctly, these intrinsics are convenience wrappers around low-level hardware instructions (in my `_mm_storeu_si32` example, the hardware instruction is `movd m32, xmm`). The higher-level `_mm_storeu_si32` is just an optional wrapper that compilers may or may not choose to implement. That would make sense. – Norgannon Sep 23 '19 at 15:03
  • It's the same for GCC and Clang: they don't always provide every intrinsic name that Intel documents, especially when there are multiple intrinsic names for the same underlying instruction or cast or whatever. (And sometimes there's just missing functionality, especially for intrinsics that aren't for a SIMD computation instruction.) – Peter Cordes Sep 23 '19 at 19:44

1 Answer


In the old off-line intrinsics guide, _mm_storeu_si32 was listed under the 'other' section. Now, in the online intrinsics guide, it is listed under SSE2, but not all compilers have implemented it yet. As a portable workaround (storeu_b) you can use:

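#include <immintrin.h>

// Uses the intrinsic directly; only compiles where _mm_storeu_si32 is provided.
void storeu_a(void* mem_addr, __m128i a) {
    _mm_storeu_si32(mem_addr, a);
}

// Portable workaround: store the low 32 bits via the single-float store.
void storeu_b(void* mem_addr, __m128i a) {
    _mm_store_ss((float*)mem_addr, _mm_castsi128_ps(a));
}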

With clang both functions compile to identical code, but other compilers may choose movd instead of movss for storeu_a and/or storeu_b:

storeu_a(void*, long long __vector(2)):                     # @storeu_a(void*, long long __vector(2))
        movss   dword ptr [rdi], xmm0
        ret
storeu_b(void*, long long __vector(2)):                     # @storeu_b(void*, long long __vector(2))
        movss   dword ptr [rdi], xmm0
        ret
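As an alternative sketch of the same idea, the low 32 bits can also be extracted with `_mm_cvtsi128_si32` and written with `memcpy`, which avoids casting the destination pointer to `float*`. The helper name `store_low32_memcpy` is hypothetical and not part of any header; this is an illustration under those assumptions, not a claim about what any compiler's own `_mm_storeu_si32` does internally.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_cvtsi128_si32 */
#include <string.h>     /* memcpy */

/* Hypothetical helper (name chosen for illustration only).
 * Writes the low 32 bits of `a` to `mem_addr`, which may be unaligned. */
static void store_low32_memcpy(void* mem_addr, __m128i a) {
    int low = _mm_cvtsi128_si32(a);      /* copy the low 32-bit lane to a scalar */
    memcpy(mem_addr, &low, sizeof low);  /* 4-byte store with no alignment requirement */
}
```

Optimizing compilers typically turn the fixed-size `memcpy` into a single store, but the exact instruction chosen is up to the compiler.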
wim
  • Note that the `_mm_storeu_si32` intrinsic is defined in `immintrin.h` in VS 2017 or later. – Chuck Walbourn Sep 23 '19 at 15:51
  • @wim Thanks a lot for the workaround, it does seem to work as intended! :D – Norgannon Sep 24 '19 at 08:01
  • @ChuckWalbourn Thanks for this information! It's odd that the intrinsic isn't even in Microsoft's [updated 2019 online documentation](https://learn.microsoft.com/en-us/cpp/intrinsics/x64-amd64-intrinsics-list?view=vs-2019) if they did in fact implement it from VS 2017 onward. – Norgannon Sep 24 '19 at 08:03
  • You can file a pull request or issue against the master docs depot on [GitHub](https://github.com/MicrosoftDocs/cpp-docs/blob/master/docs/intrinsics/x64-amd64-intrinsics-list.md) to get that fixed... – Chuck Walbourn Sep 24 '19 at 15:54
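Putting the comments and the answer together, one way to write code that builds on both old and new toolchains is a small compatibility wrapper. The sketch below assumes that `_MSC_VER >= 1910` (the first VS 2017 release) is a good enough proxy for "the header provides `_mm_storeu_si32`"; the exact update that added it is not confirmed here. The macro `HAVE_MM_STOREU_SI32` and the function `storeu_si32_compat` are hypothetical names. On every other compiler the sketch conservatively takes the `_mm_store_ss` fallback from the answer, even though recent GCC and Clang versions may provide the intrinsic too.

```c
#include <immintrin.h>

/* Assumption: VS 2017 (_MSC_VER 1910) and later ship _mm_storeu_si32.
 * HAVE_MM_STOREU_SI32 is a hypothetical macro defined only for this sketch. */
#if defined(_MSC_VER) && _MSC_VER >= 1910
#define HAVE_MM_STOREU_SI32 1
#else
#define HAVE_MM_STOREU_SI32 0   /* conservative default for everything else */
#endif

/* Hypothetical wrapper: stores the low 32 bits of `a` at `mem_addr`. */
static void storeu_si32_compat(void* mem_addr, __m128i a) {
#if HAVE_MM_STOREU_SI32
    _mm_storeu_si32(mem_addr, a);                          /* native intrinsic */
#else
    _mm_store_ss((float*)mem_addr, _mm_castsi128_ps(a));   /* workaround from the answer */
#endif
}
```

Call sites then use `storeu_si32_compat` everywhere, and the version check lives in one place if it ever needs adjusting.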