I have an Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz (Haswell) processor. On a relatively idle system, I ran the following perf command for around 5 seconds. The counters are `offcore_response.all_data_rd.l3_miss.local_dram` and `offcore_response.all_code_rd.l3_miss.local_dram`:
sudo perf stat -e offcore_response.all_data_rd.l3_miss.local_dram,offcore_response.all_code_rd.l3_miss.local_dram -p <PID>
The workloads are: 1) playing a video in VLC and 2) running the KDevelop indexer on a large code base. The outputs are shown below:
VLC:
Performance counter stats for process id '14617':
1,621,980 offcore_response.all_data_rd.l3_miss.local_dram
1,611,825 offcore_response.all_code_rd.l3_miss.local_dram
4.993841802 seconds time elapsed
KDevelop:
Performance counter stats for process id '23294':
31,006,390 offcore_response.all_data_rd.l3_miss.local_dram
10,236,222 offcore_response.all_code_rd.l3_miss.local_dram
5.095681532 seconds time elapsed
Based on these statistics, KDevelop issues more than 12 times as many L3-miss reads to local DRAM as VLC (about 41.2 million versus 3.2 million, data and code reads combined).
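To make the arithmetic behind that ratio explicit, here is a minimal sketch using the counter values from the perf output. The 64-byte cache line size, and the idea that each counted read moves exactly one line, are my assumptions for a rough estimate and are not something perf reports.

```python
# Counter values copied from the perf output above.
vlc = {"data_rd": 1_621_980, "code_rd": 1_611_825, "seconds": 4.993841802}
kdevelop = {"data_rd": 31_006_390, "code_rd": 10_236_222, "seconds": 5.095681532}

# Assumption: each counted read fetches one 64-byte cache line.
CACHE_LINE_BYTES = 64


def total_reads(sample):
    """Sum of data and code reads that missed L3 and went to local DRAM."""
    return sample["data_rd"] + sample["code_rd"]


def read_rate_mb_per_s(sample):
    """Rough DRAM read rate implied by the counters, in MB/s (10**6 bytes)."""
    return total_reads(sample) * CACHE_LINE_BYTES / sample["seconds"] / 1e6


ratio = total_reads(kdevelop) / total_reads(vlc)
print(f"KDevelop / VLC read ratio:  {ratio:.2f}")
print(f"VLC implied read rate:      {read_rate_mb_per_s(vlc):.1f} MB/s")
print(f"KDevelop implied read rate: {read_rate_mb_per_s(kdevelop):.1f} MB/s")
```

Note that both events count reads only, so this estimate says nothing about write traffic.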
But the IMC counter statistics (retrieved using PCM) are at odds with the performance counters above. On the idle system, the total system bandwidth is around 2.65 GB (READ: 2.30 GB, WRITE: 0.35 GB). The total system bandwidth for each workload (run separately) is as follows:
VLC:
around `8.40` GB (READ: `4.65` GB, WRITE: `3.75` GB)
KDevelop:
around `3.75` GB (READ: `3.15` GB, WRITE: `0.60` GB)
After subtracting the idle system bandwidth, the VLC and KDevelop bandwidths are around 5.75 GB and 1.10 GB, respectively. This time, VLC's memory access frequency is more than 5 times that of KDevelop, which is an obvious conflict with the perf results.
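For reference, the baseline subtraction can be sketched as follows, using the PCM figures quoted above. Splitting the result into reads and writes is my own addition. It seems worth looking at because the two perf events count reads only, while the PCM totals include writes.

```python
# Total system bandwidth figures (GB) as reported by PCM above.
idle = {"read": 2.30, "write": 0.35}
vlc = {"read": 4.65, "write": 3.75}
kdevelop = {"read": 3.15, "write": 0.60}


def net(workload, baseline):
    """Subtract the idle baseline from a workload, per direction."""
    return {k: round(workload[k] - baseline[k], 2) for k in workload}


vlc_net = net(vlc, idle)
kdev_net = net(kdevelop, idle)

vlc_total = round(sum(vlc_net.values()), 2)
kdev_total = round(sum(kdev_net.values()), 2)

print("VLC net:     ", vlc_net, "total", vlc_total)
print("KDevelop net:", kdev_net, "total", kdev_total)
print(f"Total ratio (VLC / KDevelop):     {vlc_total / kdev_total:.2f}")
print(f"Read-only ratio (VLC / KDevelop): {vlc_net['read'] / kdev_net['read']:.2f}")
```

This treats the idle bandwidth as a constant background that simply adds to each workload, which is itself an assumption.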
How can these two outcomes be explained?