- If the integer values of the variables `a`, `b`, `c`, ... `i`, and `r` are monotonically increasing, then the if conditionals can be optimized.
- The repeated access of a structure element and an array element can be optimized into local scalar variables (see the sketch after this list).
- Repeated allocation & deallocation of a local variable can be eliminated for time optimization.
- The repetition of the write() syscall can be eliminated for space optimization and code maintainability.
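For the second and third bullets, here is a minimal before/after sketch; the `struct Sprite`, `table[]`, and `process()` names are hypothetical stand-ins for whatever the real program uses:

    struct Sprite { struct { int x, y; } pos; };
    void process(int base, char *buf);   /* defined elsewhere; hypothetical */

    /* Before: re-reads p->pos.y and table[k] on every iteration, and the
     * local buffer is re-established inside the loop. */
    void before(struct Sprite *p, const int table[], int k, int n)
    {
        for (int t = 0; t < n; t++) {
            char buf[256];
            process(p->pos.y + table[k], buf);
        }
    }

    /* After: the structure element and the array element are cached in
     * local scalar variables, and the buffer is hoisted out of the loop.
     * Only safe if process() cannot modify p->pos.y or table[k]. */
    void after(struct Sprite *p, const int table[], int k, int n)
    {
        int base = p->pos.y + table[k];  /* loaded once, likely kept in a register */
        char buf[256];                   /* set up once for all iterations */
        for (int t = 0; t < n; t++)
            process(base, buf);
    }

A good optimizing compiler will usually hoist these itself, but only when it can prove that nothing inside the loop aliases the structure or the array.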
A binary comparison algorithm can improve upon a sequential comparison.
Instead of executing a maximum of ten if statements, the maximum could be four (plus a boolean test before the syscall).
The original code performs two integer comparisons and a logical operation (which are not that expensive) in each if, but that can be reduced to just one comparison per if.
The following (which has been tested) may be optimal for space and time.
    void Path(Point j)
    {
        int val = j.y;
        char moveMsg = 0;

        /* ASSERT(0 < a < b < c < d < e < f < g < h < i < r) */
        if (val <= e) {
            if (val <= b) {
                if (val <= a) {
                    if (val > 0)        /* && val <= a */
                        moveMsg = 'j';
                } else                  /* val <= b && val > a */
                    moveMsg = 'i';
            } else {
                if (val <= c)           /* && val > b */
                    moveMsg = 'h';
                else if (val <= d)      /* && val > c */
                    moveMsg = 'g';
                else                    /* val <= e && val > d */
                    moveMsg = 'f';
            }
        } else {
            if (val <= h) {
                if (val <= f)           /* && val > e */
                    moveMsg = 'e';
                else if (val <= g)      /* && val > f */
                    moveMsg = 'd';
                else                    /* val <= h && val > g */
                    moveMsg = 'c';
            } else {
                if (val <= i)           /* && val > h */
                    moveMsg = 'b';
                else if (val <= r)      /* && val > i */
                    moveMsg = 'a';
            }
        }

        if (moveMsg)
            write(fd1, &moveMsg, 1);
    }
The savings in time is highly dependent on the distribution of the input values. If the data skews low, the savings is small; if it skews toward large values or is uniformly distributed, the savings is greater.
The variation in execution time is also no longer a function of the input value; of course for integer comparisons this variation isn't large.
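To make that distribution dependence concrete, here is a small counting harness; the `bound[]` values are hypothetical stand-ins for `a` through `r`, and the `le()`/`gt()` shims exist only to count comparisons:

    #include <stdio.h>

    /* Hypothetical monotonic thresholds standing in for a, b, ... i, r. */
    static const int bound[11] = { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 };

    static int cmps;
    static int le(int x, int y) { cmps++; return x <= y; }
    static int gt(int x, int y) { cmps++; return x >  y; }

    /* Sequential search: one if per interval, two comparisons each
     * (the second is skipped when the first short-circuits to false). */
    static char linear(int val)
    {
        for (int k = 1; k <= 10; k++)
            if (gt(val, bound[k - 1]) && le(val, bound[k]))
                return (char)('j' - (k - 1));
        return 0;
    }

    /* Binary search, mirroring the branch structure of Path() above. */
    static char binary(int val)
    {
        if (le(val, bound[5])) {                     /* e */
            if (le(val, bound[2])) {                 /* b */
                if (le(val, bound[1]))               /* a */
                    return gt(val, 0) ? 'j' : 0;
                return 'i';
            }
            if (le(val, bound[3])) return 'h';       /* c */
            if (le(val, bound[4])) return 'g';       /* d */
            return 'f';
        }
        if (le(val, bound[8])) {                     /* h */
            if (le(val, bound[6])) return 'e';       /* f */
            if (le(val, bound[7])) return 'd';       /* g */
            return 'c';
        }
        if (le(val, bound[9]))  return 'b';          /* i */
        if (le(val, bound[10])) return 'a';          /* r */
        return 0;
    }

    int main(void)
    {
        long lin = 0, bin = 0;
        for (int v = 1; v <= 100; v++) {             /* uniform sweep */
            cmps = 0; linear(v); lin += cmps;
            cmps = 0; binary(v); bin += cmps;
        }
        printf("average comparisons: linear %.1f, binary %.1f\n",
               lin / 100.0, bin / 100.0);
        return 0;
    }

With these stand-in bounds, a uniform sweep reports about 11 comparisons on average for the sequential version versus fewer than 4 for the binary tree; a sweep skewed toward small values narrows that gap, which is exactly the distribution dependence described above.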
> ... the receiving code is optimized, while running through this segment is taking around 600 clock cycles, and I feel that it could be cut down from that.
Except for the binary algorithm, these are all well-known optimizations that a good optimizing compiler can perform for you.
Since hand optimization is becoming a useless/lost skill, I'm inclined to be dubious of your claim that "the receiving code is optimized".
The best optimizations are the proper algorithm and proper use of syscalls (e.g. the syscall to write only one byte will consume the bulk of the execution time of this routine).
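If the protocol can tolerate slightly deferred output, that syscall cost can be amortized by batching the one-byte messages; a minimal sketch, where the buffer size and the `queueMove()`/`flushMoves()` names are hypothetical:

    #include <unistd.h>

    enum { MOVE_BUF_SIZE = 64 };
    static char moveBuf[MOVE_BUF_SIZE];
    static int  moveLen;

    /* One write() syscall for up to MOVE_BUF_SIZE queued moves.
     * Handling of the write() return value is omitted here. */
    static void flushMoves(int fd)
    {
        if (moveLen > 0) {
            write(fd, moveBuf, (size_t)moveLen);
            moveLen = 0;
        }
    }

    static void queueMove(int fd, char msg)
    {
        moveBuf[moveLen++] = msg;
        if (moveLen == MOVE_BUF_SIZE)
            flushMoves(fd);
    }

Path() would then call queueMove(fd1, moveMsg) instead of write(), with flushMoves(fd1) invoked at whatever point the receiver actually needs the data; whether that deferral is acceptable depends on the protocol.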
Addendum
> Could you explain why you did the if statements like that? Why would going through more if statements help make it faster?
The optimized code has 11 if statements to find the range; that's only one "more" than the original code.
Moreover each if statement is just one simple integer comparison, versus the original's compound logical expression of two comparisons.
In terms of the number of actual ALU operations (rather than source code complexity), the original code is clearly more expensive to execute than the optimized code.
The optimized code could typically execute fewer (not more) if statements and perform fewer comparisons than the original code.
The apparent simplicity of the original code hides a well-known inefficiency: it employs a linear or sequential (i.e. one after the other) search.
The worst case scenario with the original code occurs when the input value is equal to the maximum value `r`.
That is when the original code has to execute all 10 if statements and perform 20 integer comparisons.
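For reference, the original's sequential shape was essentially the following (reconstructed from the description above, so the details are approximate):

    /* One if per interval; && short-circuits, but for j.y == r the first
     * comparison of every if succeeds, so all 20 comparisons execute. */
    if (j.y > 0 && j.y <= a)
        write(fd1, "j", 1);
    if (j.y > a && j.y <= b)
        write(fd1, "i", 1);
    /* ... six more ifs for the intervals through h ... */
    if (j.y > h && j.y <= i)
        write(fd1, "b", 1);
    if (j.y > i && j.y <= r)
        write(fd1, "a", 1);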
The optimized code uses a binary search that executes 4 if statements and performs only 4 integer comparisons for that same input.
The worst case scenario with the optimized code is the execution of 4 if statements performing 4 integer comparisons.
The best case scenario with the original code occurs when the input value is equal to the minimum value `a`.
The original code has to execute just 1 if statement, but that does involve 2 integer comparisons.
The optimized code would execute 4 if statements and perform 4 integer comparisons for that same input.
That's only 2 more comparisons, and it's the number of comparisons that influences execution time, not the number of if statements in the source code.
The penalty of two more comparisons for this best case is offset by the significant advantage of 16 fewer comparisons for the worst case.
The average case favors the optimized code using a binary search.
The original's 5 if statements performing 10 integer comparisons are slower than the optimized code's 4 if statements performing only 4 integer comparisons.
That's an advantage of 6 integer comparisons.
In fact, unless the input value is consistently not more than `a` (i.e. in the first interval), the optimized code will execute the same number of or fewer integer comparisons than the original code, despite the appearance of having more if statements.
> That seems counterintuitive to me.
With search (and sorting) algorithms, simple or straightforward usually means slow.