I'm having trouble debugging my AVX2 code in Visual Studio 2015 Update 1 (targeting Windows 10).
When I inspect an AVX2 register in the debugger, its contents differ depending on whether I set a breakpoint and step over an intrinsic (_mm256_insertf128_ps, for example) or run the program normally. The bug is easy to reproduce: create a new Windows console application with the following code in the main function:
1: __m128 lo = _mm_set1_ps(2.0f);
2: __m128 hi = _mm_set1_ps(4.0f);
3: __m256 avx = _mm256_castps128_ps256(lo);
4: avx = _mm256_insertf128_ps(avx, hi, 1);
5: for (int i = 0; i < 8; i++)
6: printf("%.2f\n", avx.m256_f32[i]);
Setting a breakpoint on line 4 and stepping over it causes the following output from the print loop on lines 5-6:
2.00
2.00
2.00
2.00
0.00 <- Wrong!
0.00 <- Wrong!
0.00 <- Wrong!
0.00 <- Wrong!
Running the program gives the following output:
2.00
2.00
2.00
2.00
4.00 <- Correct
4.00 <- Correct
4.00 <- Correct
4.00 <- Correct
I have tried this with both the MSVC and Intel (ver. 16) compilers, and both exhibit the same behavior.
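For anyone who wants to try this outside MSVC, here is a minimal sketch of the same repro that copies the register into a plain float array with _mm256_storeu_ps instead of reading the MSVC-specific m256_f32 member. It might help narrow down whether the member access or the insert itself is affected. The function name build_and_store and the TARGET_AVX macro are made up for illustration, and the sketch assumes an x86 CPU with AVX support.

```cpp
#include <immintrin.h>

// Hypothetical helper macro: lets GCC/Clang compile the AVX intrinsics in
// this one function without a global -mavx flag. MSVC needs no attribute.
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX __attribute__((target("avx")))
#else
#define TARGET_AVX
#endif

// Builds the same vector as the repro above and writes its 8 lanes to out.
TARGET_AVX void build_and_store(float out[8])
{
    __m128 lo = _mm_set1_ps(2.0f);            // lower 128 bits: four 2.0f
    __m128 hi = _mm_set1_ps(4.0f);            // upper 128 bits: four 4.0f
    __m256 avx = _mm256_castps128_ps256(lo);  // upper half undefined after the cast
    avx = _mm256_insertf128_ps(avx, hi, 1);   // overwrite the upper half with hi
    _mm256_storeu_ps(out, avx);               // unaligned store to ordinary memory
}
```

Calling build_and_store on a float[8] from main and printing the array with the same printf loop gives a version where a breakpoint can be set on the _mm256_insertf128_ps line, as in the original.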
Has anyone else run into this problem? Does anyone know what the cause could be, and is there a workaround?
Thanks in advance!