To analyze certain attributes of execution times, I was going to use both Perf and PIN in separate executions of a program to get all of my information. PIN would give me instruction mixes, and Perf would give me hardware performance on those mixes. As a sanity check, I profiled the following command line argument:
g++ hello_world.cpp -o hello
So my complete command line inputs were the following:
perf stat -e cycles -e instructions g++ hello_world.cpp -o hello
pin -t icount.so -- g++ hello_world.cpp -o hello
In the PIN commands, I ignored all the path stuff for the files for the sake of this post. Additionally, I altered the basic icount.so
to also record instruction mixes in addition to the default dynamic instruction count. The results were astonishingly different
PIN Results:
Count 1180608
14->COND_BR: 295371
49->UNCOND_BR: 21869
//skipping all of the other instruction types for now
Perf Results:
20,538,346 branches
105,662,160 instructions # 0.00 insns per cycle
0.072352035 seconds time elapsed
This was supposed to serve as a sanity check by having roughly the same instruction counts and roughly the same branch distributions. Why would the dynamic instruction counts be off by a factor of x100?! I was expection some noise, but that's a bit much.
Also, the amount of branches is 20% for Perf, but PIN reports around 25% (that also seems like a tad wide of a discrepancy, but it's probably just a side effect from the massive instruction count distortion).