I'm using Google Benchmark for a project, and I just started playing around with the --benchmark_perf_counters flag. Clearly I'm doing something wrong, since the perf counters are often negative. I'm assuming it's an overflow issue, but I still don't quite understand how the counters work to begin with.

For example, how is CACHE-MISSES 0 on the first benchmark, and then -372k on the second? Neither of those values makes sense to me.

(the two benchmarks have very similar parameters and runtime)

I'm running on Ubuntu 18.04 with an Intel(R) Xeon(R) Gold 6138 CPU. The Google Benchmark version is 1.6.1, and I have libpfm4-dev installed. I'm calling my benchmark binary with --benchmark_perf_counters=CYCLES,INSTRUCTIONS,CACHE-MISSES

-----------------------------------------------------------------------------------------------------
Benchmark                                           Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------------------
bit::shift_left (small) (AA)                     3.15 ns         3.15 ns    221185726 CACHE-MISSES=0 CYCLES=11.0005 INSTRUCTIONS=15
bit::shift_left (small) (UU)                     2.65 ns         2.65 ns    254254663 CACHE-MISSES=-372.709k CYCLES=553.131k INSTRUCTIONS=372.709k
boost::shift_left (small) (AA)                   2.71 ns         2.71 ns    258007443 CACHE-MISSES=-367.288k CYCLES=-367.288k INSTRUCTIONS=3.87586n
std::shift_left (small)                          23.5 ns         23.5 ns     29812478 CACHE-MISSES=-3.17853M CYCLES=-102.703 INSTRUCTIONS=-972.747n
Throckmorton
  • On what system? OS, CPU, software versions? – Peter Cordes May 09 '22 at 19:21
  • @PeterCordes I've updated the question w/ that information. – Throckmorton May 09 '22 at 19:24
  • I assume you're running on bare metal, or otherwise have access to working perf counters, so known-good software like `perf stat` works? (Maybe run `perf stat ./a.out` to check the counts for instructions and cycles (and thus average clock speed) on this or anything else.) – Peter Cordes May 09 '22 at 19:46
  • Yeah, I've used `perf` a number of times on the same machine with no problem. – Throckmorton May 09 '22 at 22:41

1 Answer

I had this same problem! Upgrading to the latest version of Google Benchmark (1.7.0) fixed it and now my perf counter results make sense.