How can I get my CPU's branch target buffer (BTB) size?

Question

Knowing the size would be useful when a routine like this runs with LOOPS > BTB_SIZE. For example, rewriting the loop

from

int n = 0;
for (int i = 0; i < LOOPS; i++)
    n++;

to

int n = 0;
/* two increments per iteration, so half as many loop branches; assumes LOOPS is even */
for (int i = 0; i < LOOPS; i += 2)
    n += 2;

can reduce branch misses.

BTB reference: http://www-ee.eng.hawaii.edu/~tep/EE461/Notes/ILP/buffer.html. It does not explain how to get the BTB size, though.

See http://xania.org/201602/bpu-part-one (Static branch prediction on newer Intel processors), http://xania.org/201602/bpu-part-two (Branch prediction - part two), and the author's later posts under the same tag (http://xania.org/Microarchitecture-archive). Test code is at https://github.com/mattgodbolt/agner (tests/btb*py) and at https://github.com/rmmh/whomp. - osgx, Jul 21 '16 at 20:06

score 0 · Answer 1 · answered May 13 '13 at 02:11

Any modern compiler worth its salt should optimise this code to int n = LOOPS;. In a more complex example, the compiler will also take care of such optimisations; see LLVM's auto-vectorisation, for instance, which handles many kinds of loop unrolling. Rather than trying to optimise your code by hand, find appropriate compiler flags and let the compiler do all the hard work.
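To check this yourself, here is a minimal sketch that wraps the loop from the question in a function. The name count_iterations is made up for illustration, and whether the loop is actually folded away depends on your compiler and flags.

```c
#include <assert.h>

/* Hypothetical helper: the loop from the question, with the bound passed in. */
int count_iterations(int loops)
{
    assert(loops >= 0);   /* the loop is only meaningful for non-negative bounds */
    int n = 0;
    for (int i = 0; i < loops; i++)
        n++;
    /* An optimising compiler can typically reduce the loop to `return loops;`,
       in which case no loop branch is left in the generated code. */
    return n;
}
```

Compiling this with optimisation enabled (for example gcc -O2 -S) and reading the generated assembly shows whether any loop branch remains.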

score 0 · Answer 2 · answered Aug 08 '16 at 19:38

From the BTB's point of view, both versions are the same. In both versions (if compiled unoptimized) there is only one conditional jump (originating from the i<LOOPS comparison), so there is only one jump target in the code and only one BTB entry is used. You can inspect the resulting assembly code using Matt Godbolt's Compiler Explorer.

There would be a difference between

for(int i=0;i<n;i++){
    if(i%2==0)
        do_something();
}

and

for(int i=0;i<n;i++){
    if(i%2==0)
        do_something();
    if(i%3==0)
        do_something_different();
}

The first version would need 2 BTB entries (one for the for and one for the if), while the second would need 3 (one for the for and one for each of the two ifs).

However, as Matt Godbolt found out, the BTB has 4096 entries, so I would not worry too much about them.
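To make the "one entry per branch instruction, not per iteration" point concrete, here is a toy model in C. It is only a sketch under loud assumptions: a direct-mapped table with 4096 slots indexed by branch address, made-up branch addresses, and hypothetical toy_btb_* names. Real BTBs are set-associative and considerably more complicated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BTB_ENTRIES 4096u  /* assumption: the figure quoted above */

typedef struct {
    uintptr_t tag[TOY_BTB_ENTRIES]; /* branch address held by each slot; 0 means empty */
    size_t used;                    /* number of slots that have been filled */
    size_t misses;                  /* lookups that did not find their branch */
} toy_btb;

void toy_btb_init(toy_btb *b)
{
    memset(b, 0, sizeof *b);
}

/* Look up a branch by its (made-up) address and install it on a miss. */
void toy_btb_branch(toy_btb *b, uintptr_t branch_addr)
{
    assert(branch_addr != 0);       /* 0 is reserved as the "empty" marker */
    size_t slot = branch_addr % TOY_BTB_ENTRIES;
    if (b->tag[slot] != branch_addr) {
        b->misses++;
        if (b->tag[slot] == 0)
            b->used++;
        b->tag[slot] = branch_addr;
    }
}

/* Model n iterations of a loop whose body contains `ifs` conditional
   branches, plus the loop's own branch. Returns the number of slots in use. */
size_t toy_btb_run_loop(toy_btb *b, long n, int ifs)
{
    for (long i = 0; i < n; i++) {
        for (int k = 0; k < ifs; k++)
            toy_btb_branch(b, 0x1010u + 0x10u * (uintptr_t)k); /* the ifs */
        toy_btb_branch(b, 0x1000u);                            /* the for */
    }
    return b->used;
}
```

In this model, a loop with one if occupies 2 slots and a loop with two ifs occupies 3, no matter how large n is. Only the number of distinct branch instructions matters, not the iteration count.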
