Processors are known to have special instructions that decrement a counter and branch if the counter is zero. These have very low latency, because the branch instruction does not need to wait for the decremented counter to pass through an integer unit.
Here is a link to the ppc instruction:
https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.aixassem/doc/alangref/bc.htm
My usual way of writing a loop that I believe leads a compiler to generate the appropriate instructions is as follows:
unsigned int ctr = n;
while (ctr--)
    a[ctr] += b[ctr];
Readability is high, and it is a decrementing loop that branches on zero. As you can see, though, the branch is technically taken when the counter is zero before the decrement, not after it. I was hoping the compiler could do some magic and make it work anyway. Q: Would a compiler have to break any fundamental rules of C in order to transform this into the special decrement-and-branch-conditional instructions (if any)?
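To make the question concrete, here is a minimal sketch of the loop as written next to a hand-rotated version in which the test follows the decrement. The function names are hypothetical, and a and b are assumed to be int arrays (the snippet does not say). The two forms leave the same values in a; they differ only in the final value of ctr, which is not read after the loop in either case.

```c
#include <assert.h>

/* Form from the question: test the old counter value, then decrement. */
void add_test_then_decrement(int *a, const int *b, unsigned int n)
{
    unsigned int ctr = n;
    while (ctr--)
        a[ctr] += b[ctr];
}

/* Hand-rotated form: decrement first, then branch while the counter is
   nonzero. The early return handles n == 0, where the body must not run. */
void add_decrement_then_test(int *a, const int *b, unsigned int n)
{
    unsigned int ctr = n;
    if (ctr == 0)
        return;
    do {
        --ctr;
        a[ctr] += b[ctr];
    } while (ctr != 0);
}

/* Small self-check: both forms must produce identical arrays. */
void check_same_result(void)
{
    int a1[4] = {1, 2, 3, 4};
    int a2[4] = {1, 2, 3, 4};
    const int b[4] = {10, 20, 30, 40};

    add_test_then_decrement(a1, b, 4);
    add_decrement_then_test(a2, b, 4);
    for (int i = 0; i < 4; i++)
        assert(a1[i] == a2[i]);
}
```

If this reading is right, moving the test after the decrement changes nothing that the program can observe, so it looks like the kind of rewrite a compiler could make without changing the meaning of the code.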
Another approach:
unsigned int ctr = n + 1;
while (--ctr) {
    a[ctr - 1] += b[ctr - 1];
}
The branch now happens after the decrement, but there are constants scattered around, which makes the code ugly. An "index" variable that is one less than the counter would make it look a little prettier, I guess. Looking at the available ppc instructions, the extra calculation to find the addresses in a and b can still fit in a single instruction, since a load may also perform address arithmetic (an add). I am not so sure about other instruction sets. My main problem, though, is what happens if n+1 is larger than the maximum value an unsigned int can hold. Q: Will the decrement pull it back to max and loop as usual?
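On the overflow question, a small sketch of the arithmetic involved may help. It assumes n is an unsigned int like ctr (the question does not say), and the function names are hypothetical. Unsigned arithmetic in C is defined to wrap modulo UINT_MAX + 1, so the sketch only checks which value the loop condition sees on its first evaluation; it does not touch any arrays.

```c
#include <assert.h>
#include <limits.h>

/* Returns the value of ctr at the first evaluation of `--ctr`
   in the second loop form, given the start value n + 1. */
unsigned int first_tested_value(unsigned int n)
{
    unsigned int ctr = n + 1;  /* wraps to 0 when n == UINT_MAX */
    return --ctr;              /* wraps from 0 back to UINT_MAX */
}

/* Small self-check covering the ordinary and the edge cases. */
void check_wraparound(void)
{
    assert(first_tested_value(0) == 0);               /* body is skipped */
    assert(first_tested_value(5) == 5);               /* body runs 5 times */
    assert(first_tested_value(UINT_MAX) == UINT_MAX); /* wraps, then counts down */
}
```

Under that assumption the first tested value is always n itself, so the loop body would run n times even when n + 1 wraps to 0. If n had a wider or a signed type, the conversion and overflow rules would differ and this reasoning would not carry over directly.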
Q: Is there a more commonly used pattern in C for allowing the common instruction?
Edit: ARM appears to have a decrement and branch operation too, but it branches only if the value is NOT zero. There appears to be an extra condition, just like with the ppc bc. As I see it, from the C point of view this is very much the same thing, so I expect a code snippet to be compilable to that form too without any C standard violation. http://www.heyrick.co.uk/armwiki/Conditional_execution
Edit: Intel has virtually the same branching instruction as ARM: http://cse.unl.edu/~goddard/Courses/CSCE351/IntelArchitecture/InstructionSetSummary.pdf