I'm trying to use loop unrolling to optimize my code.
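This was the original code:
int a[N]; // arbitrary array
int vara; // arbitrary variable
int varb; // arbitrary variable
for (int i = 0; i < N - 1; i++) // stop at N-2 so a[i+1] stays inside the array
    a[i] = (a[i + 1] * vara) + varb;
So I tried doing this:
int i = 0;
for (; i + 1 < N - 1; i += 2) // two elements per iteration
{
    int p0 = a[i + 1] * vara; // product for element i
    int p1 = a[i + 2] * vara; // product for element i+1
    int s0 = p0 + varb;
    int s1 = p1 + varb;
    a[i] = s0;
    a[i + 1] = s1;
}
for (; i < N - 1; i++) // leftover element when N - 1 is odd
    a[i] = (a[i + 1] * vara) + varb;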
I thought this would work because it lets the compiler do the multiplications and additions for two iterations at a time, which I expected to increase instruction-level parallelism. Yet it does not speed up my code at all. What am I doing wrong?
Any other suggestions to optimize this code would also be much appreciated.