I'm using PAPI to do some measurement and characterization work, and I have several questions about the events on Ivy Bridge.
Based on SDM Table 19-5 (Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel® Core™ i7, i5, i3 Processors), Ivy Bridge has events named `CYCLE_ACTIVITY.CYCLES_LDM_PENDING`, `CYCLE_ACTIVITY.CYCLES_L1D_PENDING` and `CYCLE_ACTIVITY.CYCLES_L2_PENDING`. However, when I ran `papi_native_avail`, I got not only these three but also a corresponding `STALLS` event for each one: `CYCLE_ACTIVITY.STALLS_LDM_PENDING`, `CYCLE_ACTIVITY.STALLS_L1D_PENDING` and `CYCLE_ACTIVITY.STALLS_L2_PENDING`. The `CYCLES` and `STALLS` events also give me different numbers. So the first question is: what is the difference between them?

The second question is related to the first. In Appendix B.3.2.3 of the Intel 64 and IA-32 Architectures Optimization Reference Manual, all the events mentioned are
`STALLS` events, which are not even mentioned in the SDM, rather than `CYCLES` events. Which are they supposed to be, `CYCLES` or `STALLS`? And which ones should I use for the memory-bound characterization described in B.3.2.3?

Appendix B.3.2.3 also gives formulas for calculating how bound the code is on each level of the memory subsystem. What confuses me is that when I measured with the `STALLS` events mentioned above, I got a larger number for
`STALLS_L2_PENDING` than for `STALLS_L1D_PENDING`, while a formula in that section reads:
%L2 Bound = (CYCLE_ACTIVITY.STALLS_L1D_PENDING - CYCLE_ACTIVITY.STALLS_L2_PENDING) / CLOCKS
Does this mean my measurement is wrong? If not, how should I calculate `%L2 Bound`, since with my numbers the formula gives a negative value?
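To make the last question concrete, here is a minimal sketch in C of how I evaluate that formula. The function names `l2_bound_fraction` and `report_l2_bound` are hypothetical helpers of my own, not PAPI or Intel APIs, and the arguments are assumed to be the raw counts read for `CYCLE_ACTIVITY.STALLS_L1D_PENDING`, `CYCLE_ACTIVITY.STALLS_L2_PENDING` and the clock count over the same interval.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: evaluates the %L2 Bound formula from B.3.2.3 on raw
 * counter values. Signed arithmetic is used on purpose so that a negative
 * numerator stays visible instead of wrapping around. */
double l2_bound_fraction(long long stalls_l1d_pending,
                         long long stalls_l2_pending,
                         long long clocks)
{
    assert(clocks > 0); /* assumption: the clock count is non-zero */
    return (double)(stalls_l1d_pending - stalls_l2_pending) / (double)clocks;
}

/* Hypothetical helper: prints the result as a percentage and returns -1 when
 * the counts make the formula negative, 0 otherwise. */
int report_l2_bound(long long stalls_l1d_pending,
                    long long stalls_l2_pending,
                    long long clocks)
{
    double fraction = l2_bound_fraction(stalls_l1d_pending,
                                        stalls_l2_pending, clocks);
    printf("%%L2 Bound = %.2f%%\n", fraction * 100.0);
    if (fraction < 0.0) {
        printf("STALLS_L2_PENDING exceeds STALLS_L1D_PENDING\n");
        return -1;
    }
    return 0;
}
```

With made-up counts where the L2 value is the larger one, as in my runs, `report_l2_bound(100, 300, 800)` reports -25.00%, which is the case I don't know how to interpret.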
The source code is at the following link: https://github.com/yqzhang/SMTM/blob/master/native/native.c
Could someone help me with this?