0

I'm trying to implement a high-performance C++ program. Each cycle I load 8 bytes into an MMX register and then process them, but of course I want to stop when I hit the end of the string.

The solution I found is this: each cycle, load 8 bytes and compare each byte with \0; if there is a \0, take precautions. The problem is that if my data is only 4 bytes long and I load 8 bytes in the first cycle, then I load 4 bytes from another application's memory space.

Will this cause me trouble? Or will I just get "noise" from these bytes, which is totally acceptable to me, because I will handle it as soon as I find the \0 character?

EralpB

2 Answers

4

SSE2 has been a thing since 2001 and is essentially universally supported now, but maybe you have a good reason to stick with MMX (targeting an embedded P3, maybe?).

Anyway, the problem persists with SSE2, and yes, it is bad to do arbitrary loads that can extend beyond the region of memory known to be valid. As far as C++ is concerned, any load beyond that region is undefined behaviour, but in practice the only way it can make a difference is if the load touches the next page and that page isn't valid, in which case the load faults.

Aligned loads (MMX did not distinguish between aligned and unaligned loads, but you can still align the address, of course) ensure that if the first byte you load is on a valid page, then the last byte is too, because an aligned 8- or 16-byte block never straddles a page boundary. So if you first process byte by byte until you reach an aligned address and then continue with aligned loads, you'll be fine.
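As a rough illustration of the "byte by byte until aligned, then aligned loads" idea, here is a minimal sketch. The function name `aligned_strlen` is made up for this example, and a plain 64-bit integer stands in for the MMX register so that the code compiles without intrinsics; the zero-byte check uses a well-known bit trick rather than an MMX compare. Reading past the terminator is still formally undefined behaviour in C++; the sketch only relies on the page argument above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: length of a NUL-terminated string, reading 8 bytes
// at a time once the pointer is 8-byte aligned.
std::size_t aligned_strlen(const char* s) {
    assert(s != nullptr);
    const char* p = s;

    // Head: go byte by byte until p sits on an 8-byte boundary.
    while (reinterpret_cast<std::uintptr_t>(p) % 8 != 0) {
        if (*p == '\0') return static_cast<std::size_t>(p - s);
        ++p;
    }

    // Body: every aligned 8-byte block lies within a single page, so if
    // its first byte is readable, the whole block is.
    const std::uint64_t ones  = 0x0101010101010101ULL;
    const std::uint64_t highs = 0x8080808080808080ULL;
    for (;;) {
        std::uint64_t chunk;
        std::memcpy(&chunk, p, sizeof chunk);  // stand-in for the 8-byte MMX load

        // Nonzero exactly when at least one byte of chunk is zero.
        if ((chunk - ones) & ~chunk & highs) {
            for (std::size_t i = 0; i < 8; ++i) {
                if (p[i] == '\0') return static_cast<std::size_t>(p - s) + i;
            }
        }
        p += 8;
    }
}
```

The bytes that follow the terminator inside the last block are simply ignored, which matches the "noise" you said you can tolerate.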

harold
0

If you are using SIMD instructions to achieve greater performance, it is also reasonable to use your own memory allocation. In your case, you need to allocate memory blocks whose sizes are multiples of the width of the SIMD instructions you use: 8 bytes for MMX, 16 for SSE, 32 for AVX. The easiest way to get suitably aligned blocks is to use _mm_malloc and _mm_free (provided by Visual Studio, and also by GCC and Clang through their intrinsics headers) or posix_memalign (on POSIX systems).
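A minimal sketch of such an allocation, assuming a POSIX system: the helper name `dup_padded` and its parameters are invented for this example. It copies a string into a buffer that is aligned to the SIMD width and whose size is rounded up to a multiple of that width, with the padding zeroed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdlib.h>  // posix_memalign, free (POSIX, not standard C++)

// Hypothetical helper: copies src into a fresh buffer aligned to `width`
// bytes, with the size rounded up to a multiple of `width`.
// Release the result with free(). Returns nullptr on allocation failure.
char* dup_padded(const char* src, std::size_t width, std::size_t* padded_size) {
    // posix_memalign requires a power of two that is a multiple of sizeof(void*).
    assert(width != 0 && (width & (width - 1)) == 0);
    assert(width % sizeof(void*) == 0);

    const std::size_t len = std::strlen(src) + 1;                  // include the NUL
    const std::size_t padded = (len + width - 1) / width * width;  // round up

    void* mem = nullptr;
    if (posix_memalign(&mem, width, padded) != 0) return nullptr;

    std::memset(mem, 0, padded);  // padding bytes read as zeros, not garbage
    std::memcpy(mem, src, len);
    if (padded_size != nullptr) *padded_size = padded;
    return static_cast<char*>(mem);
}
```

With `width` set to 8, a 4-character string ends up in an 8-byte block, so a single 8-byte load from its start stays inside the allocation. This only covers buffers you allocate yourself and read from their beginning.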

ErmIg
  • This isn't sufficient by itself. It is quite common to do `j = strchr (foo, ' '); if (j!=NULL) strcpy(k, j);` You need extra space on the end of every string. – David Schwartz Nov 25 '15 at 12:53
  • In any case, memory alignment will not hurt when using SIMD instructions. – ErmIg Nov 25 '15 at 13:02