I need to optimize my matrix multiplication by using SIMD/Intel SSE. The example code given looks like:
*x = (float*)memalign(16, size * sizeof(float));
However, I am using C++ and [found that][1] instead of `malloc` (before doing SIMD), I should use `new`. Now I'm optimizing further with SIMD/SSE, so I need aligned memory. The question is: do I need `memalign`/`_aligned_malloc`, or is my array declared like
static float m1[SIZE][SIZE];
already aligned? (`SIZE` is an `int`.)
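
To make the question concrete, here is a minimal sketch of how the alignment could be requested and checked, assuming a C++11 compiler. As far as I understand, the language itself only guarantees `alignof(float)` for a plain `static float` array (a compiler or platform ABI may happen to give more), so the sketch asks for 16 bytes explicitly with `alignas`. The value of `SIZE` and the helpers `is_aligned` and `all_rows_aligned` are hypothetical names introduced for illustration, not part of the example code I was given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical value for illustration; an array bound must be a
// compile-time constant.
constexpr std::size_t SIZE = 64;

// alignas(16) explicitly requests 16-byte alignment for the whole array,
// instead of relying on whatever the compiler picks by default.
alignas(16) static float m1[SIZE][SIZE];

// Hypothetical helper: true if address p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: checks the start of every row. Rows after the first
// are 16-byte aligned only if SIZE * sizeof(float) is a multiple of 16.
inline bool all_rows_aligned() {
    for (std::size_t i = 0; i < SIZE; ++i) {
        if (!is_aligned(m1[i], 16)) {
            return false;
        }
    }
    return true;
}
```

Calling `is_aligned(m1, 16)` checks the base address, and `all_rows_aligned()` checks every row start, which matters if aligned loads are issued at the beginning of each row. With `SIZE = 64` each row is 256 bytes, so every row starts on a 16-byte boundary; for a `SIZE` that is not a multiple of 4 the rows would drift off that boundary even though the array itself is aligned.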