I've been working on an algorithm for calculating the maximum depth of an expression (i.e. how deeply its parentheses are nested) in various languages, just for fun/practice.

I noticed a huge performance discrepancy between the functional-style C code and the imperative-style C code, and I was wondering why that is.

Given the string "(1+(2*3)+((8)/4))+1", the imperative code consistently finishes in about 10-13 µs, but the functional code takes 2-3 µs, which is several times faster. Both versions are compiled with gcc at -O2, so I found this extremely surprising, but I don't know enough about the compiler's implementation to understand why.

So can anyone tell me why the functional code is so significantly faster?

Functional code (note that the _ERR names are just #defines for integer constants):

const int max_depth_functional(
        const char *expr, const int sum, const int max) {
    switch(*expr) {
        case '\0':
            return sum == 0 ? max : UNTERM_PARENTH_ERR;
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
        case '+': case '-': case '*': case '/': case '^':
            return max_depth_functional(expr + 1, sum, max);
        case '(':
            return max_depth_functional(
                expr + 1, sum + 1, sum + 1 > max ? sum + 1 : max
            );
        case ')':
            return max_depth_functional(expr + 1, sum - 1, max);
        default:
            return INVALID_EXPR_ERR;
    }
}

Imperative code:

const int max_depth_imperative(const char *expr) {
    int curr_sum = 0, curr_max = 0;
    while(*expr != '\0') {
        switch(*expr++) {
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
            case '+': case '-': case '*': case '/': case '^':
                break;
            case '(':
                curr_sum++;
                curr_max = curr_sum > curr_max ? curr_sum : curr_max;
                break;
            case ')':
                curr_sum--;
                break;
            default:
                return INVALID_EXPR_ERR;
        }
    }
    return curr_sum == 0 ? curr_max : UNTERM_PARENTH_ERR;
}

Both are called like:

const clock_t start = clock();
const int func_result = max_depth_func(args[1]);
const clock_t end = clock();

Also, I'm building and running on Linux x86_64.

Dylan Turner
  • 0) it is not important 1) locality of reference & cache thrashing x) generalise the program to also accept {} and []. xx) maybe even recognise "" and '' strings? – wildplasser Oct 15 '20 at 19:32
  • Const doesn't add anything to a return by value. And [watch this talk](https://youtu.be/koTf7u0v41o) – JHBonarius Oct 15 '20 at 19:35
  • A single timed run is really not representative. Run it several thousand times (on different inputs) and time that. – Eugene Sh. Oct 15 '20 at 19:35
  • Also: check the generated code. Possibly (doubtful) the tail-recursion is detected and removed. [and please don't use microsecond benchmarks, or run them repeatedly] – wildplasser Oct 15 '20 at 19:39
  • 1) To over-generalize your specific example to "imperative vs. functional style" is simply *wrong*. 2) Per Eugene Sh: a single run is really not representative. 3) If you still see a significant difference with many runs, then: a) Try different compilers (e.g. gcc vs MSVS), b) try different optimization levels (e.g. -O0 vs -O3) and c) Generate assembly output (e.g. gcc/-S, or MSVS/Fa) – paulsm4 Oct 15 '20 at 19:39
  • @EugeneSh. I guess you're right. I just put the calls in a for loop from 0 to 100000 and averaged their times. Result was average of 0.002412ms for imperative and average of 0.002421ms for functional – Dylan Turner Oct 15 '20 at 19:43
  • I had run it multiple times before, but not thousands – Dylan Turner Oct 15 '20 at 19:43

1 Answer

As per comments, I ran the code using:

double imperative_time_sum = 0, functional_sum_time = 0;
for(int i = 0; i < 100000; i++) {
    const clock_t start_imp = clock();
    max_depth(args[1]);
    const clock_t end_imp = clock();
    max_depth_functional_fast(args[1], 0, 0);
    const clock_t end_func = clock();

    imperative_time_sum +=
        1000 * (double) (end_imp - start_imp) / CLOCKS_PER_SEC;
    functional_sum_time +=
        1000 * (double) (end_func - end_imp) / CLOCKS_PER_SEC;
}
printf("Average imperative: %fms\n", imperative_time_sum / 100000);
printf("Average functional: %fms\n", functional_sum_time / 100000);

Which produced results:

Average imperative: 0.002412ms
Average functional: 0.002421ms

Although I had rerun the program upwards of 100 times before, I hadn't run it anywhere close to 100000 times. Averaged over that many runs, the two times are very close to each other.
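A possible refinement, sketched here under some assumptions: each call takes only a few microseconds, which is close to the resolution of clock(), so timing every call separately may mostly measure timer noise. The sketch below times the whole loop once per implementation and divides by the iteration count. The error-code values, the `depth_fn` typedef, `functional_entry`, `avg_ns_per_call`, and `print_comparison` are hypothetical names introduced for illustration; they are not from the original program. The volatile pointer and sink are only an attempt to keep the optimizer from hoisting or dropping the calls, and the call through a function pointer adds the same small overhead to both versions.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical values: the post only says these are integer #defines. */
#define UNTERM_PARENTH_ERR (-1)
#define INVALID_EXPR_ERR   (-2)

/* Loop-based version, same logic as in the question. */
int max_depth_imperative(const char *expr) {
    int curr_sum = 0, curr_max = 0;
    while (*expr != '\0') {
        switch (*expr++) {
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
            case '+': case '-': case '*': case '/': case '^':
                break;
            case '(':
                curr_sum++;
                curr_max = curr_sum > curr_max ? curr_sum : curr_max;
                break;
            case ')':
                curr_sum--;
                break;
            default:
                return INVALID_EXPR_ERR;
        }
    }
    return curr_sum == 0 ? curr_max : UNTERM_PARENTH_ERR;
}

/* Tail-recursive version, same logic as in the question. */
int max_depth_functional(const char *expr, int sum, int max) {
    switch (*expr) {
        case '\0':
            return sum == 0 ? max : UNTERM_PARENTH_ERR;
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
        case '+': case '-': case '*': case '/': case '^':
            return max_depth_functional(expr + 1, sum, max);
        case '(':
            return max_depth_functional(
                expr + 1, sum + 1, sum + 1 > max ? sum + 1 : max);
        case ')':
            return max_depth_functional(expr + 1, sum - 1, max);
        default:
            return INVALID_EXPR_ERR;
    }
}

/* Common signature so both versions go through the same timing loop. */
typedef int (*depth_fn)(const char *);

/* Adapter that supplies the initial accumulators for the recursive version. */
int functional_entry(const char *expr) {
    return max_depth_functional(expr, 0, 0);
}

/* Average CPU time per call in nanoseconds, timing the whole loop once. */
double avg_ns_per_call(depth_fn fn, const char *expr, long iters) {
    assert(iters > 0);
    const char *volatile src = expr; /* re-read on every iteration */
    volatile int sink = 0;           /* keeps the results "used" */
    const clock_t start = clock();
    for (long i = 0; i < iters; i++) {
        sink += fn(src);
    }
    const clock_t end = clock();
    (void)sink;
    return 1e9 * (double)(end - start) / CLOCKS_PER_SEC / (double)iters;
}

/* Prints one averaged figure per implementation. */
void print_comparison(const char *expr, long iters) {
    printf("Average imperative: %.1f ns\n",
           avg_ns_per_call(max_depth_imperative, expr, iters));
    printf("Average functional: %.1f ns\n",
           avg_ns_per_call(functional_entry, expr, iters));
}
```

Calling `print_comparison(args[1], 100000)` from main would give one averaged figure per implementation. If the two numbers stay close, that would be consistent with the comment suggesting gcc may turn the tail recursion into a loop, though only the generated assembly (e.g. from gcc -S) can confirm that.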

Dylan Turner