7

I have this C function:

#include <math.h>

double f(int x)
{
    if (x <= 0)
        return 0.0;
    else
        return x * log(x);
}

which I am calling in a tight loop, and would like to get rid of the branch to see if it improves performance.

I cannot use this:

double f(int x)
{
    return x * log(x);
}

because it returns NaN when x == 0 (which is true about 25% of the time.)

Is there another way to implement it so that it returns 0 when x == 0, but still get rid of the branch?

(I am less concerned about negative inputs, because these are errors, whereas zeros are not.)

finnw
  • `?: ` is also branching but if you want to get rid of `if-else` then you can use it – Omkant Nov 15 '12 at 17:55
  • Is there a reasonably finite range of values for `x`, or is it more or less unconstrained? – user229044 Nov 15 '12 at 17:56
  • @Omkant: It is the actual branching I am concerned about, not the syntax. – finnw Nov 15 '12 at 17:57
  • @meagar: it can be anything between 0 and 2^52 – finnw Nov 15 '12 at 17:58
  • 1
    Well, you *do* have a piecewise function. It's hard to imagine how to evaluate it without a condition on the pieces. But check your assembly, which may replace the branch by a conditional-move anyway. – Kerrek SB Nov 15 '12 at 17:59
  • 5
    Performance of this calculation is going to be completely dominated by `log`. Therefore, you *want* the branch in there, because you don't want to call `log` if you're going to throw away the answer. – zwol Nov 15 '12 at 18:03
  • 1
    Logarithm is implemented as a function on x86-64. Always calling the expensive function without a branch is unlikely to beat branching and occasionally skipping the call. – fuz Nov 15 '12 at 18:04
  • 5
    For the purposes of testing performance, just replace the function with `return x * log(x)`. That gives the wrong answer, sure, but it's no slower than whatever branch-free code you could possibly come up with. So unless it's dramatically faster than what you have, you can stop. There's no need to actually come up with the branch-free code because you've established that it won't help. – Steve Jessop Nov 15 '12 at 18:07
  • @meagar: small values (≤20) are much more common than larger values though, so if I am going to have the branch anyway it might be worth keeping them in a table. Is that what you were thinking of? – finnw Nov 15 '12 at 18:07

3 Answers

13

First note that log(1) = 0. Then you can write the problem as x * log(y), where y = 1 if x <= 0 and y = x otherwise. When y = 1, the value of x doesn't matter, because log(y) = 0 and the product is zero.

Something like y = (x > 0)*x + (x <= 0) will do this, and then:

double f(int x) {
    return x * log((x > 0)*x + (x <= 0));
}

It just depends on whether log(1) and four integer ops are worse than a branch.

Hodapp
  • 2
    Works. As you said, it's probably a lot slower than a branch. It does provide a perfect answer to the OP's question, though, so +1. – Tim Nov 15 '12 at 18:27
  • I'd be curious if log(1) is a special case that bails out early. Even then, though, the two comparisons, add, and multiply may take longer than a branch. – Hodapp Nov 15 '12 at 18:29
  • Of course it's down to the compiler whether this code is actually branch-free or not, but it has the nice feature of looking like it might be ;-) – Steve Jessop Nov 15 '12 at 21:11
  • But where would a compiler get a branch out of this? From the comparison operators? – Hodapp Nov 15 '12 at 21:18
  • @Hodapp: correct. x86 has `setle` and `setg` conditional instructions, ARM allows all instructions to be conditional, but not every architecture does. – Steve Jessop Nov 16 '12 at 09:34
  • a minor variant would be to cache the comparison: `double f(int x) { int b = x > 0; return x * log(b*x + (1-b)); }`. Almost certainly makes no difference here, but may be helpful in other circumstances where the condition `b` is more complex. – wren romano Nov 08 '15 at 01:56
11

Compiler extensions can help here. In GCC, you would do this:

if(__builtin_expect(x > 0, 1)) {
    return x * log(x);
}
return 0.0;

GCC will then generate machine code that favors the path where x > 0 is true.

If you don't care about negative numbers, then you can treat x == 0 as an unlikely branch instead:

if(__builtin_expect(x == 0, 0)) {
    return 0.0;
}
return x * log(x);

If you're not on GCC, you should check the documentation of your compiler and see whether it provides an analogous feature.

Note that it's still not branch-free. It's just that the likely branch takes less time.

Nikos C.
  • When `x` is 0 “true about 25% of the time”, I do not think that favoring `x > 0` helps. – Pascal Cuoq Nov 15 '12 at 18:04
  • 1
    @PascalCuoq If `x` is <=0 25% of the time, then it's >0 75% of the time. So that's what favored here. At least that's how I understood the question. – Nikos C. Nov 15 '12 at 18:07
  • This gets rid of the jump in the more frequent case, but not of the branch, as I [understand the term](http://en.wikipedia.org/wiki/Branch_(computer_science)). It probably still helps the OP, so I'm upvoting. – user4815162342 Nov 15 '12 at 18:10
7

Any branch-free code must contain a calculation of x * log(x) to cover the "normal" case.

So, before trying to come up with that branch-free code, measure the speed of x * log(x) alone. Unless it's significantly faster than the code you have, there's nothing significant to be gained here. And I suspect it won't be.

Steve Jessop