Questions tagged [papi]

PAPI (Performance Application Programming Interface) provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events.

86 questions
3 votes, 3 answers

How to count memory accesses to remote NUMA memory nodes?

In a multi-threaded application running on a recent Linux Distributed Shared Memory system, is there a straightforward way to count the number of requests per thread to remote (non-local) NUMA memory nodes? I am thinking of using PAPI to count…
nandu
3 votes, 3 answers

Profiling Cache hit rate of a function of C program

I want to get the cache hit rate for a specific function (foo) of a C/C++ program running on a Linux machine. I am using gcc with no compiler optimization. With perf I can get hit rates for the entire program using the following command. perf stat -e…
Atanu Barai
3 votes, 1 answer

Reading hardware counters from perf_event_uncore list with PAPI

I am trying to read one of the hardware counters with PAPI. When I try to read events from the perf_event list, it works fine. However, I now need to read one of the counters from the perf_event_uncore list, which is obtained with papi_native_avail, but I…
Ana Khorguani
3 votes, 0 answers

Changing irrelevant part of the function changes papi measurement of branch prediction

I am playing with some code that I found online, and I want to try different branch prediction examples to better understand branch predictors. The CPU is an AMD Ryzen 3600. Basically, in the code below, I am trying to measure a…
user12527223
3 votes, 1 answer

How to define a user-defined event to be measured by PAPI?

Most of today's processors are equipped with hardware performance counters. Such counters can be used to count micro-architecture events in order to analyse the target program to improve its performance. Generally, profiling and analysing are the…
3 votes, 0 answers

Installing PAPI on VirtualBox (using counters on a virtual machine)

I'm using Ubuntu with VirtualBox on Windows 10. I'm having problems using any function of the PAPI library and get the error (Error in PAPI_flops: Event does not exist). When I run the make test step of the installation guide, I have this: But the perf…
Ms wolf
3 votes, 1 answer

How to correctly measure IPC (Instructions per cycle) with perf

I wonder how to measure instructions per cycle correctly using perf. As a reference, http://www2.engr.arizona.edu/~tosiron/papers/SPEC2017_ISPASS18.pdf used inst_retired.any and cpu_clk_unhalted.ref_tsc for their calculations, and I'm now wondering if…
pointhi
3 votes, 0 answers

Profiling cache misses for separate pthread using PAPI

I am trying to investigate the performance of my program, in which cache misses are a huge bottleneck. For testing purposes, before integrating PAPI into the target application, I needed to verify how things work, which is why I posted a sample…
Jakob Danielsson
3 votes, 1 answer

Fixing COMPSs tracing error: PAPI_read failed for thread X evtset X (papi_hwc.c:*)

I am trying to run COMPSs with the tracing system (Extrae) activated. I first had an installation issue, but I solved it thanks to this question: How to fix libpapi.so.* cannot open shared object file when running (py)COMPSs with tracing? However, now I…
Cristian Ramon-Cortes
3 votes, 1 answer

How to fix libpapi.so.* cannot open shared object file when running (py)COMPSs with tracing?

When I try to run a COMPSs application with the tracing system activated, I get the following error: libpapi.so.5.3.0.0 cannot open shared object file. I am using Ubuntu and I have installed COMPSs from the packages with apt-get. To launch the…
3 votes, 0 answers

How do I properly use papi_native_avail to get network performance monitoring events on a BG/Q system?

I'm trying to gather network performance counter data on a BG/Q system with a BG Torus interconnect. I'm using PAPI as this seems to be the most recommended way of doing it, with the other option being the bgpm library, which I don't think is…
Patrick
3 votes, 1 answer

Counting integer operations on Sandy Bridge

I'd like to calculate the computational intensity of my code, but it works with integers, not floats. I thought about counting the number of operations with PAPI, but the hardware doesn't provide counters for integer operations. How can I do this?
a3mlord
2 votes, 0 answers

Is pid in perf_event_open() actually tid?

Under sample mode, when I test on all cpus, pid = 1: fd = perf_event_open(&attr, 1, -1, -1, 0); I always see pid == tid: ********************************** value of meta_page: 7fd262ce2000 value of meta_page->data_head: 9838 value of…
2 votes, 0 answers

How to aggregate PAPI uncore events such as skx_unc_imc0 to measure for all devices?

PAPI counts events per device (which can be an iMC or a caching and home agent (CHA)). These are counted as separate events, such as skx_unc_imc0::UNC_M_RPQ_OCCUPANCY for iMC 0. Is there a way to measure this for all the iMCs at the same time? The Linux perf…
2 votes, 0 answers

PAPI data cache miss inconsistencies in multithreaded benchmark

I have a problem regarding the consistency of performance counters in my benchmark application. Namely, the reported number of L1 data cache misses exceeds the L1 data cache loads occasionally and is not what I expect. The benchmark basically fills…
fresapore