The list of predefined perf events, such as branches, cycles, and LLC-load-misses, is documented by the source code of the perf subsystem in the Linux kernel. Each generic name is mapped, sometimes only partially, to a different hardware event depending on the CPU model and microarchitecture. If your CPU is Intel, it can be more useful to use ocperf.py (and toplev.py) from andikleen's pmu-tools with the event names from Intel's documentation. ocperf is not official, but it is written by an Intel employee and uses the official lists from https://download.01.org/perfmon/ (see https://download.01.org/perfmon/readme.txt: "This package contains performance monitoring event lists for Intel processors").
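To make the generic names concrete: according to perf_event_open(2), a generic cache event such as LLC-load-misses is requested as type PERF_TYPE_HW_CACHE with a config value packed as id | (op << 8) | (result << 16). The sketch below only reproduces that packing. The enum and function names are my own for illustration; the numeric values are meant to mirror the enums in <linux/perf_event.h> and are redefined here so the snippet builds without kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copies of perf_hw_cache_id, perf_hw_cache_op_id and
 * perf_hw_cache_op_result_id; names here are local to this sketch. */
enum cache_id {
    CACHE_L1D = 0, CACHE_L1I = 1, CACHE_LL = 2, CACHE_DTLB = 3,
    CACHE_ITLB = 4, CACHE_BPU = 5, CACHE_NODE = 6, CACHE_MAX
};
enum cache_op { OP_READ = 0, OP_WRITE = 1, OP_PREFETCH = 2, OP_MAX };
enum cache_result { RESULT_ACCESS = 0, RESULT_MISS = 1, RESULT_MAX };

/* Pack (cache, operation, result) into a perf_event_attr.config value
 * for PERF_TYPE_HW_CACHE: id | (op << 8) | (result << 16). */
static uint64_t hw_cache_config(enum cache_id id, enum cache_op op,
                                enum cache_result res)
{
    assert(id < CACHE_MAX && op < OP_MAX && res < RESULT_MAX);
    return (uint64_t)id | ((uint64_t)op << 8) | ((uint64_t)res << 16);
}
```

With this packing, LLC-load-misses corresponds to hw_cache_config(CACHE_LL, OP_READ, RESULT_MISS). The kernel then translates that triple into a model-specific hardware event.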
For x86 and x86_64, perf maps these (ancient) predefined/generic names in the arch/x86/events directory. For example, for all Intel Core microarchitectures, check arch/x86/events/intel/core.c and search for the microarchitecture by its code name (Core, Core2, NHM=Nehalem, WSM=Westmere, SNB=SandyBridge, IVB=IvyBridge, HSW=HaSWell, BDW=BroaDWell, SKL=SKyLake, SLM=SiLverMont, and others from the lists, and amd). For Skylake, the structure is at line 394 of intel/core.c in kernel 4.15.8, and it shows that the PREFETCH counters are not mapped for any of the caches ("not supported"):
static __initconst const u64 skl_hw_cache_event_ids
 [ C(L1D ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x81d0,  /* MEM_INST_RETIRED.ALL_LOADS */
        [ C(RESULT_MISS)   ] = 0x151,   /* L1D.REPLACEMENT */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x82d0,  /* MEM_INST_RETIRED.ALL_STORES */
        [ C(RESULT_MISS)   ] = 0x0,
 ...
 [ C(LL  ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7,   /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS)   ] = 0x1b7,   /* OFFCORE_RESPONSE */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7,   /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS)   ] = 0x1b7,   /* OFFCORE_RESPONSE */
    },
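To read the numeric IDs in such a table, it helps to split them into fields. The sketch below assumes the usual Intel event-select layout, where the low byte is the event select and the next byte is the unit mask, so 0x81d0 is event 0xd0 with umask 0x81. It ignores any higher bits (cmask, edge, and so on), and the struct and function names are hypothetical helpers, not kernel or perf APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: the two low fields of an event ID. */
struct raw_event {
    uint8_t event_select;  /* bits 0-7 */
    uint8_t umask;         /* bits 8-15 */
};

/* Split an ID such as 0x81d0 into event select and unit mask. */
static struct raw_event split_event_id(uint64_t id)
{
    struct raw_event ev;
    ev.event_select = (uint8_t)(id & 0xff);
    ev.umask = (uint8_t)((id >> 8) & 0xff);
    return ev;
}

/* Write the ID in the cpu/event=...,umask=.../ form that perf accepts
 * with -e. Returns the snprintf result (length of the full string). */
static int format_perf_spec(uint64_t id, char *buf, size_t len)
{
    struct raw_event ev = split_event_id(id);
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "cpu/event=0x%02x,umask=0x%02x/",
                    (unsigned)ev.event_select, (unsigned)ev.umask);
}
```

For 0x81d0 this yields cpu/event=0xd0,umask=0x81/, which should count the same thing as the raw form r81d0 on a CPU where that event exists.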
A second structure defines the additional flags/masks needed by events like OFFCORE_RESPONSE:
static __initconst const u64 skl_hw_cache_extra_regs
 [ C(LL  ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                               SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_READ|
                               SKL_L3_MISS|SKL_ANY_SNOOP|
                               SKL_SUPPLIER_NONE,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                               SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_WRITE|
                               SKL_L3_MISS|SKL_ANY_SNOOP|
                               SKL_SUPPLIER_NONE,
 ...
 [ C(NODE) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                               SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_READ|
                               SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                               SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_WRITE|
                               SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
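The way the two tables work together can be sketched in user-space C. As far as I can tell, the kernel looks up the event ID for a (cache, op, result) triple, treats an ID of 0 as "not supported", and otherwise takes the extra-register value from the same position in the second table. The version below is a simplified model for a single cache level, not the kernel's code: every DEMO_ name is invented for illustration, and the flag bit positions are placeholders rather than the real SKL_ definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag bits; NOT the real SKL_* values. */
#define DEMO_DEMAND_READ (1ULL << 0)
#define DEMO_LLC_ACCESS  (1ULL << 16)
#define DEMO_L3_MISS     (1ULL << 26)

enum demo_op { DEMO_OP_READ, DEMO_OP_PREFETCH, DEMO_OP_MAX };
enum demo_result { DEMO_ACCESS, DEMO_MISS, DEMO_RESULT_MAX };

/* Event IDs for one cache level; 0 marks an unmapped combination. */
static const uint64_t demo_event_ids[DEMO_OP_MAX][DEMO_RESULT_MAX] = {
    [DEMO_OP_READ]     = { [DEMO_ACCESS] = 0x1b7, [DEMO_MISS] = 0x1b7 },
    [DEMO_OP_PREFETCH] = { [DEMO_ACCESS] = 0x0,   [DEMO_MISS] = 0x0 },
};

/* Extra-register values at the same indices as the event IDs. */
static const uint64_t demo_extra_regs[DEMO_OP_MAX][DEMO_RESULT_MAX] = {
    [DEMO_OP_READ] = {
        [DEMO_ACCESS] = DEMO_DEMAND_READ | DEMO_LLC_ACCESS,
        [DEMO_MISS]   = DEMO_DEMAND_READ | DEMO_L3_MISS,
    },
};

struct demo_config {
    uint64_t config;   /* event ID, e.g. OFFCORE_RESPONSE (0x1b7) */
    uint64_t config1;  /* value for the extra register */
};

/* Resolve a generic (op, result) pair into event ID plus extra value.
 * Returns 0 on success, -EINVAL for out-of-range indices, and
 * -ENOENT when the table entry is 0 (not supported). */
static int demo_resolve(enum demo_op op, enum demo_result res,
                        struct demo_config *out)
{
    assert(out != NULL);
    if ((unsigned)op >= DEMO_OP_MAX || (unsigned)res >= DEMO_RESULT_MAX)
        return -EINVAL;

    uint64_t id = demo_event_ids[op][res];
    if (id == 0)
        return -ENOENT;

    out->config = id;
    out->config1 = demo_extra_regs[op][res];
    return 0;
}
```

This is why the same 0x1b7 code can appear for both RESULT_ACCESS and RESULT_MISS: the distinction is carried entirely by the extra-register value, while a 0 entry (as for PREFETCH) makes the generic event unavailable on that CPU.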