I'm vectorizing part of my program, but it crashes with a segmentation fault. What is wrong with it? Here is the simplified section that causes the problem. Stepping with j++ and i++ is exactly what I want; I do not want to use j += 16.

    unsigned short int input[256][256] __attribute__((aligned(32))); // global

    for (i = 0; i < 256 - 16; i++) {
        for (j = 0; j < 256 - 16; j++) {
            temp_v2 = _mm256_load_si256((__m256i *)&input[i][j]);
        }
    }
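
One thing worth checking is alignment: _mm256_load_si256 requires its address to be a multiple of 32 bytes, and although the array itself is declared aligned(32), each element is only 2 bytes wide, so &input[i][j] moves in 2-byte steps as j increases. The sketch below tests that condition for the same loop bounds without using any intrinsics. The helper names is_aligned32 and count_misaligned are hypothetical, added only for illustration, and the code assumes GCC or Clang for the aligned attribute.

```c
#include <stdint.h>

#define N 256

/* Same declaration as in the question: only the array base is
   guaranteed to be 32-byte aligned. */
unsigned short int input[N][N] __attribute__((aligned(32)));

/* Hypothetical helper: returns 1 if &input[i][j] is a multiple of
   32 bytes (what an aligned 256-bit load requires), 0 otherwise. */
int is_aligned32(int i, int j) {
    return ((uintptr_t)&input[i][j] % 32) == 0;
}

/* Hypothetical helper: counts the (i, j) pairs visited by the
   question's loops whose address is NOT 32-byte aligned. */
int count_misaligned(void) {
    int count = 0;
    for (int i = 0; i < N - 16; i++) {
        for (int j = 0; j < N - 16; j++) {
            if (!is_aligned32(i, j)) {
                count++;
            }
        }
    }
    return count;
}
```

Each row is 512 bytes, which is a multiple of 32, so only j matters: the address is aligned when j is a multiple of 16 and misaligned otherwise, which covers most iterations of the inner loop. If that is indeed the cause, the unaligned-load intrinsic _mm256_loadu_si256 takes the same pointer argument and would let j++ stay as it is.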