How to debug a _mm_mul_ps function?

Question

I have this code:

inline __m128 process(const __m128 *buffer) {
    __m128 crashTest;
    for (int i = 0; i < mFactor; i++) {
        crashTest = _mm_mul_ps(buffer[i], _mm_set1_ps((float)(((int32_t)1) << 16)));
    }

    return crashTest;
}

When I call it with some "buffer", it crashes the application (i.e. a segmentation fault).

How can I debug it and discover which value causes the crash? I tried a try/catch, but it doesn't catch the segmentation fault.

I can't "cout" the values, because I'm inside a heavy audio process (something like 44100 x n cout calls per second, which freezes the I/O).

Any tips?

Check that `buffer` is 16-byte aligned. Also check for the usual sort of memory bugs: `buffer` is null, corrupted, used after freed, has fewer than `mFactor` elements, etc. — Nate Eldredge, Aug 03 '21 at 17:43
@NateEldredge buffer is defined as `__m128 oversampleBuffer[kOversample];` - so it is 16-byte aligned, right? — markzzz, Aug 03 '21 at 19:48
`alignof(__m128) = 16`, so declaring an actual array of that type should result in 16-byte alignment if it's in static or automatic storage. (Only C++17 really respects over-alignment with `new`, although x86-64 has `alignof(max_align_t) = 16` so that's already sufficient for SSE.) — Peter Cordes, Aug 03 '21 at 20:30
Anyway, check the address with a debugger when it crashes; the low hex digit must be `0` to use it this way. — Peter Cordes, Aug 03 '21 at 20:32
Is `kOversample` guaranteed to be >= `mFactor`? Maybe add an `assert` to check for this? — Paul R, Aug 03 '21 at 20:44
*How can I debug it? To discover which value will cause the crash?* - Run it inside a debugger so you can look at variable values (and memory) when it crashes. That's what debuggers are for; don't waste your time catching SIGSEGV manually. — Peter Cordes, Aug 03 '21 at 23:12

Soonts · Accepted Answer · 2021-08-04T00:48:33.827

`_mm_mul_ps` is not really a function. It looks like one, but it is an intrinsic that compiles into a single instruction: either `mulps` or `vmulps`, depending on compiler settings. The output is well defined over the complete range of inputs, and the instruction does the right thing even with unusual values like INF, NaN, or denormals.
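To illustrate that the arithmetic itself does not fault on unusual inputs, here is a minimal sketch. The helper name `scaleBy65536` is hypothetical and not part of the question's code; it uses unaligned loads and stores so that alignment plays no role in the experiment.

```cpp
#include <cmath>
#include <limits>
#include <xmmintrin.h>

// Hypothetical helper: multiplies four floats by 2^16, the same scale
// factor the question's process() uses. Unaligned load/store intrinsics
// are used so any float pointer is acceptable here.
inline void scaleBy65536(const float* in, float* out) {
    const __m128 v = _mm_loadu_ps(in);
    const __m128 scale = _mm_set1_ps(65536.0f);
    _mm_storeu_ps(out, _mm_mul_ps(v, scale));
}
```

Feeding it infinity, a quiet NaN, or the largest finite float produces infinity, NaN, and infinity respectively, with no crash. That is why a segmentation fault around `_mm_mul_ps` points at the memory being read rather than at the values in it.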

If that function crashes, the probable cause is a memory access. Most likely it is an out-of-bounds access to the `buffer` argument. Another possible cause is the argument not being 16-byte aligned, although that only crashes when the code compiles into the SSE `mulps` instruction, not the AVX `vmulps` instruction. In both cases, no amount of printing is going to help: you'll simply move the crash from `_mm_mul_ps` into your vector printing function.

If for some reason you can't use a debugger, `#include <assert.h>` and add a few checks to the function.

Checking the range is unreliable and platform-dependent, but you can still do it: use the `VirtualQuery` API on Windows, or parse the mapped address ranges from the `/proc/self/maps` text file on Linux.

Checking for alignment is trivial, though: `assert( 0 == ( ((size_t)buffer) % 16 ) );`
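The checks mentioned so far (null pointer, alignment, element count) could be combined roughly as follows. This is only a sketch: `isAligned16` and `processChecked` are hypothetical names, and the `count` and `capacity` parameters are assumptions standing in for `mFactor` and `kOversample` from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>

// Returns true when the address is a multiple of 16 bytes.
inline bool isAligned16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) % 16) == 0;
}

// Hypothetical variant of process(): count plays the role of mFactor,
// capacity is the number of __m128 elements the buffer really holds.
inline __m128 processChecked(const __m128* buffer, int count, int capacity) {
    assert(buffer != nullptr);                // null pointer
    assert(isAligned16(buffer));              // misaligned pointer
    assert(count >= 0 && count <= capacity);  // reading past the end

    const __m128 scale = _mm_set1_ps(65536.0f);
    __m128 result = _mm_setzero_ps();         // defined even when count == 0
    for (int i = 0; i < count; i++) {
        result = _mm_mul_ps(buffer[i], scale);
    }
    return result;
}
```

A failed `assert` reports the file and line of the violated condition before aborting, which is far cheaper than printing every sample. The checks disappear when the code is built with `NDEBUG` defined, so they cost nothing in a release build.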

P.S. The best long-term solution, however, is to add a buffer size argument. Alternatively, supply another pointer for the end of the input buffer, or replace the raw pointer with `const std::vector<__m128>&`. With any of these approaches, you'll be able to detect out-of-bounds access and fail gracefully with an exception instead of crashing the process with an access violation.
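A minimal sketch of the first option, an explicit buffer size argument, might look like this. The name `processBounded` and both size parameters are assumptions made for illustration; `factor` stands in for the question's `mFactor`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <xmmintrin.h>

// Hypothetical signature: bufferSize is the number of __m128 elements the
// caller actually owns, factor is how many of them should be processed.
inline __m128 processBounded(const __m128* buffer,
                             std::size_t bufferSize,
                             std::size_t factor) {
    // A null buffer is treated as a programming error (debug builds only).
    assert(buffer != nullptr);

    // A size mismatch is reported as an exception instead of a crash.
    if (factor > bufferSize) {
        throw std::out_of_range("processBounded: factor exceeds buffer size");
    }

    const __m128 scale = _mm_set1_ps(65536.0f);
    __m128 result = _mm_setzero_ps();
    for (std::size_t i = 0; i < factor; i++) {
        result = _mm_mul_ps(buffer[i], scale);
    }
    return result;
}
```

The caller can then wrap the call in `try`/`catch (const std::out_of_range&)` and log the mismatch once, outside the per-sample hot path.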

`assert(0 == (((size_t)oversampleBuffer) % 16));` does nothing, so I think it is aligned? To be sure, I've added `alignas(16) __m128 oversampleBuffer[kOversample];`, but the problem remains, so it must be elsewhere. I'll check the size of the arrays/factor and let you know. — markzzz, Aug 04 '21 at 08:18
