1

I am using Intel Intrinsics and getting this odd error.

src/header/header.c:18:3: error: can’t convert value to a vector
       18 |   int has_value = (int)_mm_cmpestrc(buffer, 4, u_str.vec, 4,
          |   ^~~

I have tried the below without the (int) cast, i have tried with <nmmintrin.h> as well

#include "./header.h"
#ifdef __SIMD__
#include <x86intrin.h>
#endif

static inline void parse_with_simd(const char *buffer, const int buffer_len) {
  union {
    __m128i vec;
    char * str;
  } u_str = {.str = "GET "};
  int has_value = (int)_mm_cmpestrc(buffer, 4, u_str.vec, 4,
                               _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH); // <-- this line

My CPPFLAGS and CFLAGS

CFLAGS = -Wall -O0 -std=c11 -g
CPPFLAGS = -DDEBUG -D__SIMD__

When I look at the definition of _mm_cmpstrc it shows the return type is an int too!

#define _mm_cmpestrc(A, LA, B, LB, M) \
  (int)__builtin_ia32_pcmpestric128((__v16qi)(__m128i)(A), (int)(LA), \
                                    (__v16qi)(__m128i)(B), (int)(LB), \
                                    (int)(M))

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Your first argument to the `_mm_cmpestrc` call is a `char*` - this is **very** different from the type specified in [the documentation](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_cmpestrc), where it is an `__m128i`. Also note that error messages in macro expansions can point to strange places. – Adrian Mole Aug 22 '21 at 19:49
  • ... using MSVC (and the `nmmintrin.h` header), I get this: *error C2440: 'function': cannot convert from 'const char *' to '__m128i'* – Adrian Mole Aug 22 '21 at 19:54
  • @AdrianMole I added a cast from `(__m128i)buffer` and same issue. My understanding is they are both pointers to string data. – Christopher Clark Aug 22 '21 at 19:55
  • The `__m128i` type is (most likely) implemented by your C compiler as a `struct` or `union`. You can't cast a `char*` to a `struct`. Not sure what you're trying to do, so can't really offer much help. – Adrian Mole Aug 22 '21 at 19:57
  • This [Microsoft document](https://learn.microsoft.com/en-us/cpp/cpp/m128i?view=msvc-160) *may* offer some help. – Adrian Mole Aug 22 '21 at 20:01
  • You're passing a *pointer* to the data with `buffer`, but the intrinsic is expecting the data itself. A load has to take place; no amount of casting will accomplish that. I think you need one of the `load` intrinsics. – Nate Eldredge Aug 22 '21 at 20:05
  • Replace 'char * str;' with 'char str[4]' – tstanisl Aug 22 '21 at 20:05
  • @tstanisl You would need *at least* `char str[5]` to hold the given string literal but more likely a power-of-2 size. But that does nothing to address the problem with the first argument. – Adrian Mole Aug 22 '21 at 20:15
  • 1
    @AdrianMole, maybe it should be `char str[sizeof(__m128i)]` instead. The `_mm_cmpestrc` doc says that the string must be embedded into the vector, not a pointer to a string. – tstanisl Aug 22 '21 at 20:22

1 Answers1

2

The instruction requires the content of the string to be put to the vector. Not a pointer to the string. Using memcpy is probably easiest way to achieve it.


static inline void parse_with_simd(const char *buffer, const int buffer_len) {
  __m128i a, b;
  // requires buffer_len be at most 16
  memcpy(&a, buffer, buffer_len);
  memcpy(&b, "GET ", 5);
  int has_value = _mm_cmpestrc(a, 4, b, 4, _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH);
  ...
}
tstanisl
  • 13,520
  • 2
  • 25
  • 40
  • Variable-sized memcpy that might leave some of `a` uninitialized is probably slow and seems like a poor choice. If you allocate `buffer` such that it's [guaranteed not to be within 16 bytes of an unmapped page](//stackoverflow.com/questions/37800739/is-it-safe-to-read-past-the-end-of-a-buffer-within-the-same-page-on-x86-and-x64), `__m128i a = _mm_loadu_si128( (const __m128i*)buffer )` would be the standard way. `_mm_cmpestr*` takes a length arg to ignore garbage past that length. (Although it's slower than `pcmpistri` which looks for a `0` byte in the data, for implicit-length C strings). – Peter Cordes Aug 22 '21 at 20:53
  • 2
    Similarly, `b = _mm_setr_epi8('G', 'E', 'T', ' ', 0,0,0,...)` would be the normal way to init that operand. (Or `_mm_loadu_si32` from a string literal or char array would work, or even `_mm_cvtsi32_si128` from a `uint32_t` with the right little-endian hex value.). – Peter Cordes Aug 22 '21 at 20:57
  • @Peter Cordes yes but OP provided virtually no constrains on length of the buffer. Neither any detail what is done after '_mm_cmpestrc'. Loading the whole vector may segfault. Using memcpy is a bit safer option – tstanisl Aug 22 '21 at 20:59
  • Only fully safe in terms of C UB if you zero the vector object first, otherwise it can be partially uninitialized, but logically `_mm_cmpestrc` reads the whole `__m128i`. And yes, as I said, to be able to do the efficient thing (loading exactly 16 bytes, not "the whole buffer"), you need to make sure the caller never passes a pointer that right near the end of a page. The whole point of using SIMD is performance, so if you shoot yourself in the foot in the process of using it, you're defeating the purpose. – Peter Cordes Aug 22 '21 at 21:01
  • I think the issue here is my implementation, as well as both buffers being the incorrect type. The buffer can be of any size technically. I'm not sure how to answer this, as this answers the question. I have modified my implementation, and now it contains a bug rather than not compiling, so i will mark this with an upvote, no answer, and make a new question. – Christopher Clark Aug 22 '21 at 23:34
  • I meant "I'm not not sure how to vote on this, as this answers the question" for some reason i can't edit my previous comment. – Christopher Clark Aug 22 '21 at 23:44