25

I've come across a few scenarios where I want to say a function's return value is likely inside the body of a function, not the if statement that will call it.

For example, say I want to port code from using a LIKELY macro to using the new [[likely]] annotation. But these go in syntactically different places:

#define LIKELY(...) __builtin_expect(!!(__VA_ARGS__),0)
if(LIKELY(x)) { ... } 

vs

if(x) [[likely]] { ... }

There's no easy way to redefine the LIKELY macro to use the annotation. Would defining a function like

inline bool likely(bool x) { 
  if(x) [[likely]] return true;
  else return false;
}

propagate the hint out to an if? Like in

if(likely(x)) { ... }

Similarly, in generic code, it can be difficult to directly express algorithmic likelihood information in the actual if statement, even if this information is known elsewhere. For example, a copy_if where the predicate is almost always false. As far as I know, there is no way to express that using attributes, but if branch weight info can propagate through functions, this is a solved problem.

So far I haven't been able to find documentation about this and I don't know a good setup to test this by looking at the outputted assembly.

Riley
  • 982
  • 1
  • 7
  • 19
  • 2
    It certainly seems logical that this *could* be done. The question then would be whether any given compiler is actually smart enough to do it. Is there a particular compiler you have in mind? Of course as far as the standard is concerned, no compiler is obliged to do anything at all with the hint. – Nate Eldredge Oct 09 '20 at 21:14
  • @NateEldredge Personally, I'm most interested in clang and gcc, but knowing more never hurts. – Riley Oct 09 '20 at 22:32
  • 2
    It's certainly plausible for compilers to track an "expected" value of something after inlining. Especially GCC, given that the semantics of the original `__builtin` are to provide an expected value for a variable, not specific to using it in a branch. (As Nate shows, GCC does this, but clang trunk [seems not to](https://stackoverflow.com/questions/64286836/do-branch-likelihood-hints-carry-through-function-calls/64287545#comment113683915_64287545)) – Peter Cordes Oct 10 '20 at 02:19
  • In 20/20 hindsight, of course, it would've been better to define the macro originally as `#define LIKELY(x) (__builtin_expect(!!(x),0))` and use it as `if LIKELY(x) { ... }`. That way, porting it would've been easy. (Or one could've even defined an `if_likely(x)` macro with the `if` keyword moved into the macro definition.) – Ilmari Karonen Oct 10 '20 at 16:02

3 Answers3

15

The story appears to be mixed for different compilers.

On GCC, I think your inline likely function works, or at least has some effect. Using Compiler Explorer to test differences on this code:

inline bool likely(bool x) { 
  if(x) [[likely]] return true;
  else return false;
}

//#define LIKELY(x) likely(x)
#define LIKELY(x) x

int f(int x) {
    if (LIKELY(!x)) {
        return -3548;
    }
    else {
        return x + 1;
    }
}

This function f adds 1 to x and returns it, unless x is 0, in which case it returns -3548. The LIKELY macro, when it's active, indicates to the compiler that the case where x is zero is more common.

This version, with no change, produces this assembly under GCC 10 -O1:

f(int):
        test    edi, edi
        je      .L3
        lea     eax, [rdi+1]
        ret
.L3:
        mov     eax, -3548
        ret

With the #define changed to the inline function with the [[likely]], we get:

f(int):
        lea     eax, [rdi+1]
        test    edi, edi
        mov     edx, -3548
        cmove   eax, edx
        ret

That's a conditional move instead of a conditional jump. A win, I guess, albeit for a simple example.

This indicates that branch weights propagate through inline functions, which makes sense.

On clang, however, there is limited support for the likely and unlikely attributes, and where there is it does not seem to propagate through inline function calls, according to @Peter Cordes 's report.

There is, however, a hacky macro solution that I think also works:

#define EMPTY()
#define LIKELY(x) x) [[likely]] EMPTY(

Then anything like

if ( LIKELY(x) ) {

becomes like

if ( x) [[likely]] EMPTY( ) {

which then becomes

if ( x) [[likely]] {

.

Example: https://godbolt.org/z/nhfehn

Note however that this probably only works in if-statements, or in other cases that the LIKELY is enclosed in parentheses.

Anonymous1847
  • 2,568
  • 10
  • 16
  • 2
    Looks like you tested with a recent GCC. You forgot to say which compiler in your answer, or to include a godbolt link (https://godbolt.org/z/c6Mr54). clang is different from GCC here. Also, strange that GCC makes branchless code when one way is "known" to be likely; that would be a good opportunity to break the data dependency on the input. It chooses *not* to do if-conversion to `cmov` even at `-O3` without hints. Also, if the condition is `x` instead of `!x`, it uses a `cmp`/`sbb`/`and`/`sub` trick that's pretty questionable: false dep on RAX, no ILP, longer dep chain. – Peter Cordes Oct 10 '20 at 02:32
  • @Peter Cordes Yeah, I agree it's strange the conditional move isn't the default. However this still does show [[likely]] propagates through inline functions, which is the important bit. GCC has, what, 30 years of development behind it now, and this is a pretty simple example and hardly contrived, so it seems like there might be some hidden cost to a cmove over a conditional branch in this case that we're overlooking. The fact that a cmove is made when we tell GCC `x == 0` is more likely suggests this cost is mostly when `x != 0`, i.e. when the branch or cmove is not taken. – Anonymous1847 Oct 10 '20 at 04:31
  • 1
    Don't read *too* much into GCC's decisions. It's a complex piece of machinery, but not "smart" and doesn't have human-level understanding of context, and minor missed-optimization bugs are common (e.g. just look at the `x` instead of `!x` output, or analyze it with IACA or LLVM-MCA). It might be slightly more meaningful if this was part of a loop or something where we could actually talk about whether there's a loop-carried dependency chain, and whether the data dependency is important at all. `cmovz` is a single-uop instruction on AMD, and on Intel Broadwell and later. (2 on earlier Intel) – Peter Cordes Oct 10 '20 at 04:46
  • @PeterCordes I added a second part to my answer with a macro solution. – Anonymous1847 Oct 10 '20 at 06:44
5

gcc 10.2 at least is able to make this deduction (with -O2).

If we consider the following simple program:

void foo();
void bar();

void baz(int x) {
    if (x == 0)
        foo();
    else
        bar();
}

then it compiles to:

baz(int):
        test    edi, edi
        jne     .L2
        jmp     foo()
.L2:
        jmp     bar()

However if we add [[likely]] on the else clause, the generated code changes to

baz(int):
        test    edi, edi
        je      .L4
        jmp     bar()
.L4:
        jmp     foo()

so that the not-taken case of the conditional branch corresponds to the "likely" case.

Now if we pull the comparison out into an inline function:

void foo();
void bar();

inline bool is_zero(int x) {
    if (x == 0)
        return true;
    else
        return false;
}

void baz(int x) {
    if (is_zero(x))
        foo();
    else
        bar();
}

we are again back to the original generated code, taking the branch in the bar() case. But if we add [[likely]] on the else clause in is_zero, we see the branch reversed again.

clang 10.0.1 however does not demonstrate this behavior and seems to ignore [[likely]] altogether in all versions of this example.

Nate Eldredge
  • 48,811
  • 6
  • 54
  • 82
  • 3
    clang 10.1 says: `warning: unknown attribute 'likely' ignored [-Wunknown-attributes]`, so no wonder it doesn't do anything. :/ Clang trunk supports it, but does *not* propagate it through inlining. (You do see an effect if you tag the `if` in `baz` with `[[likely]]` vs. `[[unlikely]]`: https://godbolt.org/z/EsW5xY) – Peter Cordes Oct 10 '20 at 02:14
3

Yes, it will probably inline, but this is quite pointless.

The __builtin_expect will continue to work even after you upgrade to a compiler that supports those C++ 20 attributes. You can refactor them later, but it will be for purely aesthetic reasons.

Also, your implementation of the LIKELY macro is erroneous (it is actually UNLIKELY), the correct implementations are nelow.

#define LIKELY( x )   __builtin_expect( !! ( x ), 1 )
#define UNLIKELY( x ) __builtin_expect( !! ( x ), 0 )
KevinZ
  • 3,036
  • 1
  • 18
  • 26
  • 3
    `__builtin_expect` requires GNU extensions; ISO C++20 `[[likely]]` does not, and thus can be used even in portable code (including to MSVC). But upvoted for pointing out the bug in the OP's macro! – Peter Cordes Oct 11 '20 at 23:34
  • What about the case of generic code where the actual conditional and the likelihood information aren't defined in the same place? – Riley Oct 12 '20 at 02:29
  • 1
    I also fixed the likely implementation! I believe it needs to be variadic so the preprocessor doesn't balk at a template with multiple arguments. – Riley Oct 12 '20 at 02:32
  • @Riley The existence of "likelihood information", as in `__builtin_expect_with_probability`, is rare and its accuracy and usefulness is always in doubt. If it existed, it better be defined in the same place. Locality of information is important. – KevinZ Oct 12 '20 at 13:08
  • @Riley Allowing commas in your if statement, or essentially executing random stuff before the actual conditional, is a giant code smell. You really shouldn't encourage that, especially in a macro. – KevinZ Oct 12 '20 at 13:22
  • You don't think templated functions etc should be in if statements? – Riley Oct 12 '20 at 20:08
  • 1
    @Riley Ah, so that's why you made the macro variadic. You don't need to do that. If you want to put that multivariate template inside the macro, you can enclose the expression containing the template in an extra set of parenthesis. That way, the commas inside will not be interpreted as separate arguments for the macro. Don't write a variadic macro for something that clearly isn't variadic. – KevinZ Oct 12 '20 at 21:46