3

Assuming I have an if-else branch in C++, how can I measure (in code) how often the branch is mispredicted? I would like to add some calls or macros around the branch (similar to how you do bottom-up profiling) that would report branch mispredictions.

It would be nice to have a generic method, but let's start with the Intel i5 2500K.

Peter Cordes
Bartłomiej Siwek
  • 2
    Maybe something like Valgrind's `massif` profiler... I doubt you can do it "in-code", as the program execution is entirely transparent to the program itself. – Kerrek SB Jan 08 '12 at 20:27
  • Depending on the CPU you may be able to access the CPU's performance registers, but the question lacks sufficient detail to give specific advice. – Paul R Jan 08 '12 at 20:29
  • 2
    Note that the results you get are very much CPU-dependent, and that you probably can't get accurate measurements with anything intrusive (since changing any code around the branch will change its position, which can affect branch prediction). But what exactly do you hope to gain from that information? Does it matter to you how often that specific branch is mispredicted (rather than how long it takes to execute, which might matter)? – Grizzly Jan 08 '12 at 20:40
  • @LuchianGrigore: See http://en.wikipedia.org/wiki/Branch_prediction. – Oliver Charlesworth Jan 08 '12 at 20:53
  • So I added CPU information. The gain I'm hoping for is to have a better idea what optimizations to do - i.e. when I look at the branch I would like to see if mispredictions are even an issue. – Bartłomiej Siwek Jan 08 '12 at 21:02
  • I wonder if gcc `-fprofile-arcs` would help? – Omnifarious Jan 08 '12 at 21:43

3 Answers

2

If you are using an AMD CPU, AMD's CodeAnalyst is just what you need (works on Windows and Linux)*.

If you're not, you may need to pay for a VTune licence, or build something yourself using the on-CPU performance registers and counters detailed in the instruction manuals.

You can also check out gperf and OProfile (Linux only) and see how well they perform (I've never used them, but I see them referred to quite a bit).

*CodeAnalyst should work on an Intel CPU; you just don't get all the nice CPU-level analysis.
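For the "build something using the on-CPU counters" route, here is a minimal sketch of what that could look like on Linux, where the kernel exposes the counters through the `perf_event_open(2)` system call. The `BranchMissCounter` class and its method names are made up for illustration and are not part of any library. It assumes a Linux system where unprivileged access to the counters is allowed (this depends on `perf_event_paranoid` and can fail inside containers or VMs). Note that it counts every mispredicted branch executed by the thread between `start()` and `stop()`, not only the one you are interested in, so it is most useful around a hot loop dominated by that branch.

```cpp
// Linux-only sketch: count branch mispredictions around a region of code
// using the perf_event_open(2) system call.
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper class, named here for illustration only.
class BranchMissCounter {
public:
    BranchMissCounter() {
        perf_event_attr attr;
        std::memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_BRANCH_MISSES;
        attr.disabled = 1;        // created stopped; start() enables it
        attr.exclude_kernel = 1;  // count user-space code only
        attr.exclude_hv = 1;
        // pid = 0 and cpu = -1: the calling thread, on whichever CPU it runs.
        // glibc provides no wrapper, so the raw syscall is used.
        fd_ = static_cast<int>(
            syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL));
    }

    ~BranchMissCounter() {
        if (fd_ >= 0) close(fd_);
    }

    BranchMissCounter(const BranchMissCounter&) = delete;
    BranchMissCounter& operator=(const BranchMissCounter&) = delete;

    // False if the kernel refused to open the counter (no permission,
    // no hardware support, virtualised environment, ...).
    bool ok() const { return fd_ >= 0; }

    // Zero the counter and begin counting.
    void start() {
        assert(ok());
        ioctl(fd_, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd_, PERF_EVENT_IOC_ENABLE, 0);
    }

    // Stop counting and return the mispredictions seen since start().
    // Returns 0 if the counter value could not be read.
    std::uint64_t stop() {
        assert(ok());
        ioctl(fd_, PERF_EVENT_IOC_DISABLE, 0);
        std::uint64_t count = 0;
        if (read(fd_, &count, sizeof(count)) !=
            static_cast<ssize_t>(sizeof(count))) {
            return 0;
        }
        return count;
    }

private:
    int fd_ = -1;
};
```

To use it, construct a `BranchMissCounter`, check `ok()`, call `start()` before the loop containing the branch and `stop()` after it. Comparing the result against the same loop with the branch removed or made trivially predictable gives a rough idea of how much that branch contributes.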

Necrolis
  • Update to an old answer: these days Linux `perf record --all-user -e branch-misses ./a.out` / `perf report` can be useful. (Or use a precise (PEBS) version of the event, like `-e br_misp_retired.all_branches_pebs`.) – Peter Cordes Jul 01 '22 at 04:36
2

I wonder if it would be possible to extract this information from `g++ -fprofile-arcs`? It has to measure exactly this in order to feed the results back into the optimizer so it can optimize branching.
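As far as I understand it, what arc profiling records is how often each branch goes each way, not how often the CPU guessed wrong. Still, a heavily lopsided ratio suggests the branch is probably easy to predict, while something near 50/50 with no obvious pattern is a candidate for mispredictions. A rough in-code equivalent, as a sketch only: `BranchStats`, `COUNTED` and `sum_of_odds` are names invented for this example, and the counting itself adds work around the branch, so it will perturb timing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: tallies how often a condition was true or false.
struct BranchStats {
    std::uint64_t taken = 0;
    std::uint64_t not_taken = 0;

    // Record the outcome and pass the condition through unchanged.
    // Written without an if so the bookkeeping adds no branch in the source.
    bool record(bool cond) {
        taken += cond;
        not_taken += !cond;
        return cond;
    }

    std::uint64_t total() const { return taken + not_taken; }

    // Fraction of evaluations where the condition was true.
    double taken_ratio() const {
        assert(total() > 0 && "no outcomes recorded yet");
        return static_cast<double>(taken) / static_cast<double>(total());
    }
};

// Wrap the condition of the branch you want to observe.
#define COUNTED(stats, cond) ((stats).record(cond))

// Example of an instrumented branch: sums the odd values in data.
long sum_of_odds(const int* data, std::size_t n, BranchStats& stats) {
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (COUNTED(stats, data[i] % 2 != 0)) {
            sum += data[i];
        }
    }
    return sum;
}
```

After running the instrumented code on representative input, `stats.taken_ratio()` tells you how biased the branch is. It says nothing about patterns the predictor can learn (a strictly alternating branch is 50/50 but easy to predict), so treat it as a hint, not a misprediction count.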

Omnifarious
1

OProfile

OProfile is pretty complex, but it can profile anything your CPU tracks.

Look through the Event Type Reference and look for your particular CPU.

For instance, here are the Core 2 events. After a quick search, I don't see any event counters for missed branch predictions on the Core 2 architecture.

Community
deft_code