Assume there are four nested loops with different loop counters and conditions. Is there any way to tell the compiler (icc, gcc, and clang) to transform all the loops into one loop?
    N=128; M=128; P=3; Q=3; //All these variables are constant
    for (n=0; n<N; n++){
        for(m=0; m<M; m++){
            temp=0;
            for(p=0; p<P; p++){
                for(q=0; q<Q; q++){
                    temp += kernel[p][q] * input[n+p][m+q];
                }
            }
            output[n][m]=temp;
        }
    }
To be transformed to:
for(;;)
//computations...
In my experience this is useful when you rely on auto-vectorization. If there is a way to merge just two of the nested loops, that would work as well. There is something that solved this problem, but with hand-written code. I have a program, and you can see it here on godbolt.
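To make the target concrete, here is a minimal hand-written sketch of the transformation I have in mind, written in C. It is only an illustration, not what any compiler is known to emit. The element type (float), the input size of (N+P-1) x (M+Q-1), and the helper names conv_nested, conv_collapsed, fill_inputs, and check_equal are my own assumptions and do not come from the real program. The collapsed version runs one loop over N*M*P*Q iterations and recovers n, m, p, and q from the single counter with division and modulo.

```c
#include <assert.h>

enum { N = 128, M = 128, P = 3, Q = 3 };

/* Assumed types and sizes: float data, input padded so n+p and m+q stay in range. */
static float kernel[P][Q];
static float input[N + P - 1][M + Q - 1];
static float out_nested[N][M];
static float out_collapsed[N][M];

/* Reference: the original four nested loops. */
static void conv_nested(void) {
    for (int n = 0; n < N; n++) {
        for (int m = 0; m < M; m++) {
            float temp = 0;
            for (int p = 0; p < P; p++) {
                for (int q = 0; q < Q; q++) {
                    temp += kernel[p][q] * input[n + p][m + q];
                }
            }
            out_nested[n][m] = temp;
        }
    }
}

/* Hand-collapsed: a single loop whose counter encodes (n, m, p, q). */
static void conv_collapsed(void) {
    float temp = 0;
    for (int i = 0; i < N * M * P * Q; i++) {
        int q = i % Q;             /* innermost index varies fastest */
        int p = (i / Q) % P;
        int m = (i / (Q * P)) % M;
        int n = i / (Q * P * M);   /* outermost index varies slowest */

        if (p == 0 && q == 0) {
            temp = 0;              /* start of a new output element */
        }
        temp += kernel[p][q] * input[n + p][m + q];
        if (p == P - 1 && q == Q - 1) {
            out_collapsed[n][m] = temp;  /* last tap: store the result */
        }
    }
}

/* Fill the arrays with small integer values so every sum is exact in float. */
static void fill_inputs(void) {
    for (int p = 0; p < P; p++)
        for (int q = 0; q < Q; q++)
            kernel[p][q] = (float)(p * Q + q + 1);
    for (int r = 0; r < N + P - 1; r++)
        for (int c = 0; c < M + Q - 1; c++)
            input[r][c] = (float)((r * 7 + c * 3) % 11);
}

/* Run both versions and assert that they produce identical output. */
static void check_equal(void) {
    conv_nested();
    conv_collapsed();
    for (int n = 0; n < N; n++)
        for (int m = 0; m < M; m++)
            assert(out_nested[n][m] == out_collapsed[n][m]);
}
```

Calling fill_inputs() and then check_equal() from main compares the two versions. The division and modulo in the collapsed loop are not free, so whether this form actually helps auto-vectorization is exactly what I am unsure about. A middle ground would be to merge only the n and m loops and leave the 3x3 kernel loops as they are.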