12

I'd like to use hardware performance counters, specifically those on x86 CPUs, to measure cache misses or branch mispredictions. These counters are heavily used by advanced profilers such as Intel VTune. Please don't confuse them with the performance counters exposed by the Windows operating system.

To use these counters in a C/C++ program, one can use PAPI: http://icl.cs.utk.edu/papi/

PAPI makes performance counters easy to use, but only on Linux. It once supported Windows, but no longer does.

Has anyone recently used PAPI or another API to read hardware performance counters on Windows?

Nullptr
  • Mind if I ask: Are you writing real bang-on-bits code? Most Windows app code gets nowhere near that. – Mike Dunlavey Jan 06 '12 at 23:15
  • I was going to suggest VTune, but you already brought it up. So what's wrong with it? – Mahmoud Al-Qudsi Jan 07 '12 at 04:39
  • No, I'm writing some profiling code, so I need an API I can program against. On Linux PAPI is fine, but on Windows I'm still looking for a current API to use HW performance counters. – Nullptr Jan 07 '12 at 07:01
  • 3
    There seems to be no general API. When you need this, you have to build and use your own driver. Because open-source projects cannot sign a driver, these projects died out. You can use a self-signed driver if you enable test signing. – Christopher Jan 10 '12 at 00:25
  • psapi still seems to include its old Windows code in its latest release; it may be better to update that and submit a patch (as I highly doubt you'll find a cross-platform API that is up to date). – Necrolis Jan 10 '12 at 06:31

2 Answers

8

You can use the RDPMC instruction or the __readpmc MSVC compiler intrinsic, which emits that same instruction.

However, Windows prohibits user-mode applications from executing this instruction by setting CR4.PCE to 0. Presumably this is because the meaning of each counter is determined by model-specific registers (MSRs), which are only accessible in kernel mode. In other words, unless you're a kernel-mode module (e.g. a device driver), you will get a "privileged instruction" trap if you attempt to execute this instruction.

If you're writing a user-mode application, your only option (as @Christopher mentioned in the comments) is to write a kernel module that executes this instruction for you, at the cost of a user->kernel call penalty, and to enable test signing on your machine so that your presumably self-signed "driver" can be loaded. This means you can't easily distribute the app, but it will work for in-house tuning.
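To see the user-mode behaviour described above for yourself, something like the following sketch may help. It is not a definitive implementation: `try_read_pmc` and `pmc_fixed_selector` are hypothetical helper names, the MSVC branch relies on structured exception handling to catch the trap, and the bit-30 selector for fixed-function counters is an Intel-specific assumption that does not come from the answer itself. On other compilers the sketch simply reports failure instead of executing the instruction.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <windows.h>  /* structured exception handling macros */
#include <intrin.h>   /* __readpmc */
#endif

/* Assumption: on Intel CPUs, bit 30 of the RDPMC selector chooses the
 * fixed-function counters (e.g. index 0 = instructions retired). */
#define PMC_FIXED_FLAG (1u << 30)

/* Hypothetical helper: build a selector for fixed-function counter `index`. */
uint32_t pmc_fixed_selector(uint32_t index)
{
    return PMC_FIXED_FLAG | index;
}

/* Hypothetical helper: try to read a counter from the current mode.
 * Returns 1 and stores the value on success; returns 0 if the read
 * trapped (expected in user mode when CR4.PCE is 0) or if this build
 * has no MSVC intrinsic available. */
int try_read_pmc(uint32_t selector, uint64_t *value)
{
#if defined(_MSC_VER)
    int ok = 0;
    __try {
        *value = __readpmc((unsigned long)selector);
        ok = 1;
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        ok = 0;  /* typically the "privileged instruction" exception */
    }
    return ok;
#else
    (void)selector;
    (void)value;
    return 0;  /* not attempted outside MSVC in this sketch */
#endif
}
```

Calling `try_read_pmc(pmc_fixed_selector(0), &v)` from an ordinary Windows process would be expected to return 0 if the instruction is blocked as described; the same `__readpmc` call inside a driver is what would actually yield a value, provided the counters have been programmed through the MSRs first.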

Rom
2

What about this HCP Reference? Does it not provide what you want?

wilx