
I am running Linux on a 32-nm Intel Westmere processor. I am seeing seemingly conflicting DTLB miss numbers from the performance counters. I ran two experiments with a single-threaded random memory access test program, as follows:

  • Experiment (1): I counted the DTLB misses using the following performance counter:

    DTLB_MISSES.WALK_COMPLETED (Event 49H, Umask 02H)

  • Experiment (2): I counted the DTLB misses by summing the values of the following two counters:

    MEM_LOAD_RETIRED.DTLB_MISS (Event CBH, Umask 80H)

    MEM_STORE_RETIRED.DTLB_MISS (Event 0CH, Umask 01H)
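
For reference, the three events above could be passed to Linux perf as raw event codes. The following is a minimal sketch, assuming the usual Intel raw encoding in which the unit mask occupies bits 8-15 and the event select occupies bits 0-7. The helper name `raw_event` and the `EVENTS` table are made up for illustration.

```python
def raw_event(event_select: int, umask: int) -> str:
    """Build a perf raw-event string: 'r' followed by hex of (umask << 8 | event_select).

    Assumes the common Intel layout (event select in bits 0-7, umask in bits 8-15).
    """
    return f"r{(umask << 8) | event_select:04x}"


# Event select and umask values are taken from the experiments listed above.
EVENTS = {
    "DTLB_MISSES.WALK_COMPLETED": raw_event(0x49, 0x02),
    "MEM_LOAD_RETIRED.DTLB_MISS": raw_event(0xCB, 0x80),
    "MEM_STORE_RETIRED.DTLB_MISS": raw_event(0x0C, 0x01),
}

if __name__ == "__main__":
    for name, code in EVENTS.items():
        print(f"{name}: {code}")
```

The resulting codes could then be used along the lines of `perf stat -e r0249,r80cb,r010c ./test_program`, where `./test_program` is a placeholder for the actual benchmark binary.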

I expected the results of the two experiments to be similar. However, the count reported in experiment (1) is almost twice that of experiment (2). I am at a loss as to why this is the case.

Can somebody help shed some light on this apparent discrepancy?

horro
Arka

1 Answer


That is expected. The first event counts misses in all TLB levels from every possible cause (load, store, prefetch), including memory accesses performed speculatively. The other two events count only retired (that is, non-speculative) load and store operations, and among those only the ones that didn’t cause any fault.

Please refer to Chapter 19.6 of Volume 3 of Intel® 64 and IA-32 Architectures Software Developer’s Manual.
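
One rough way to quantify the gap described above is to subtract the sum of the two retired events from the completed-walk count. The sketch below assumes the three counts were collected over the same run; the function name `unattributed_walks` and the sample numbers are made up for illustration and are not measurements.

```python
def unattributed_walks(walks_completed: int,
                       load_retired_miss: int,
                       store_retired_miss: int) -> tuple[int, float]:
    """Return (excess, ratio) for completed walks versus retired DTLB misses.

    excess: walks not matched by a retired load or store miss, for example
            walks caused by speculative accesses or hardware prefetching.
    ratio:  completed walks divided by the retired-miss sum.
    """
    retired = load_retired_miss + store_retired_miss
    excess = walks_completed - retired
    ratio = walks_completed / retired if retired else float("inf")
    return excess, ratio


if __name__ == "__main__":
    # Hypothetical counts chosen only to show a roughly 2x discrepancy.
    excess, ratio = unattributed_walks(2000, 700, 300)
    print(f"excess walks: {excess}, ratio: {ratio:.2f}")
```

A ratio near 1 would suggest the two methods agree; a ratio near 2, as reported in the question, would leave about half of the completed walks unexplained by retired loads and stores.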

Thanks,

Stas

  • Thanks for your comments, but I beg to differ. The first event counts only page walks that follow a miss in all TLBs (a page walk is not triggered if an L1 TLB miss hits in the L2 TLB). The second set of events, on the other hand, could possibly include both L1 and L2 TLB misses. I also made sure that software prefetches are not used and that all memory is pre-faulted in. The only thing that probably remains is DTLB misses due to misspeculation. However, since I counted only "completed" page walks rather than all page walks, I thought that was also taken care of. Am I missing something? – Arka Jul 31 '13 at 18:16
  • Well, "completed" here effectively means "any that missed all TLB levels", and the difference in your case may be explained by misspeculation, HW prefetching, and probably some inaccuracy (though I cannot prove it) in the first event (as it is not a precise event, while the other two are precise). – user2638717 Aug 01 '13 at 11:23