I'm new here, so I apologize if I'm asking the wrong question.

Introduction: I am writing code on Windows Server 2022 and I need to read data from the CPU's instruction cache. Is there any way to do this from Ring 3 or Ring 0 (in assembly or C++)?

My attempts to solve this problem: I tried `GetLogicalProcessorInformation`, but it only gave me information such as the number of logical processors. So the main question is: how can I read the contents of the instruction cache?

Barbosso
  • You haven't said _why_ you want to do this. – Etienne de Martel Jul 31 '23 at 03:49
  • @EtiennedeMartel This is a small part of my project. I want to build a fast tracer, because a tracer built on the debug event loop is too slow for many projects, both for optimizing them and for researching what compiler optimizations are possible. – Barbosso Jul 31 '23 at 03:53
  • On x86, I-cache is coherent with D-cache, so normal loads will read the same data as instruction fetch. The only thing that can actually read from the I-cache is the CPU's front-end instruction-fetch logic. But you mention getting information about number of logical processors, etc; did you actually want to get information **about** the I-cache (e.g. via `cpuid`), rather than reading data **from** it? The `cpuid` instruction is unprivileged, and will report cache geometry. – Peter Cordes Jul 31 '23 at 04:04
  • @PeterCordes Hello. I would like to read data from it (from the I-cache). I am also doing this on the AMD64 architecture. – Barbosso Jul 31 '23 at 04:11
  • `GetLogicalProcessorInformation` can't help if you want to read the bytes of machine code that are currently cached by the I-cache. It reads specifications like cache sizes, not the current values in the microarchitectural state. I don't think there is a way to query which cache lines are currently hot in I-cache (not on x86-64 at least). – Peter Cordes Jul 31 '23 at 04:17
  • You could maybe `rdpmc` a performance counter that was counting an event like `icache_64b.iftag_miss` on Skylake. Like read it before jumping to a line, then read it again after and check if there were any misses. (With `cpuid` or `serialize` before and after maybe, to prevent speculative fetch). But that would mean the cache line would need to hold code that jumps to measurement code, so this doesn't work for checking I-cache state of lines in a normal program, only *maybe* in a microbenchmark / experiment with hand-crafted assembly. – Peter Cordes Jul 31 '23 at 04:19
  • A `perf record` sampling profile will give you an idea of which lines are actually causing uop-cache and/or L1i misses on average over a long run of some code. – Peter Cordes Jul 31 '23 at 04:20
  • @PeterCordes Thanks for the information, although I am on the Zen architecture. https://www.amd.com/system/files/TechDocs/55803-ppr-family-17h-model-31h-b0-processors.pdf Based on this document, you're right: I can't access it directly =( I will experiment further! Thanks! – Barbosso Jul 31 '23 at 05:04
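
As the comments note, `cpuid` is unprivileged and reports cache geometry, not cache contents. A rough sketch of what that looks like in C++ is below. It assumes the deterministic cache parameters layout of CPUID leaf 4 (Intel) and the matching AMD leaf 0x8000001D, and the names `CacheInfo`, `decode_cache_leaf`, `cpuid_count` and `print_instruction_caches` are made up for illustration. It only describes the I-cache (size, ways, line size); it does not read any bytes from it.

```cpp
#include <cstdint>
#include <cstdio>

#if defined(_MSC_VER)
#include <intrin.h>   // __cpuidex
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    // __cpuid_count (GCC / Clang)
#endif

// Geometry of one cache, as described by one subleaf of CPUID leaf 4
// (Intel) or leaf 0x8000001D (AMD). Hypothetical helper type.
struct CacheInfo {
    unsigned type;        // 0 = no more caches, 1 = data, 2 = instruction, 3 = unified
    unsigned level;       // 1 = L1, 2 = L2, ...
    unsigned line_size;   // bytes per cache line
    unsigned partitions;  // physical line partitions
    unsigned ways;        // associativity
    unsigned sets;        // number of sets

    std::uint64_t size_bytes() const {
        return static_cast<std::uint64_t>(ways) * partitions * line_size * sets;
    }
};

// Pure decoding of the EAX/EBX/ECX values returned by a cache subleaf.
// Every size field is stored by the CPU as "value minus one".
CacheInfo decode_cache_leaf(std::uint32_t eax, std::uint32_t ebx, std::uint32_t ecx) {
    CacheInfo c{};
    c.type       = eax & 0x1F;
    c.level      = (eax >> 5) & 0x7;
    c.line_size  = (ebx & 0xFFF) + 1;
    c.partitions = ((ebx >> 12) & 0x3FF) + 1;
    c.ways       = ((ebx >> 22) & 0x3FF) + 1;
    c.sets       = ecx + 1;
    return c;
}

// Thin wrapper over the compiler's cpuid intrinsic; returns false on non-x86 builds.
bool cpuid_count(std::uint32_t leaf, std::uint32_t subleaf, std::uint32_t regs[4]) {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int r[4];
    __cpuidex(r, static_cast<int>(leaf), static_cast<int>(subleaf));
    for (int i = 0; i < 4; ++i) regs[i] = static_cast<std::uint32_t>(r[i]);
    return true;
#elif defined(__x86_64__) || defined(__i386__)
    __cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
    return true;
#else
    (void)leaf; (void)subleaf; (void)regs;
    return false;
#endif
}

// Prints every instruction cache the CPU enumerates.
// Returns the number found, or -1 if cpuid is unavailable in this build.
int print_instruction_caches() {
    std::uint32_t r[4];
    if (!cpuid_count(0, 0, r)) return -1;
    const std::uint32_t max_basic = r[0];
    cpuid_count(0x80000000u, 0, r);
    const std::uint32_t max_ext = r[0];

    // Try leaf 4 first; if it reports nothing (as assumed on AMD), fall back to 0x8000001D.
    std::uint32_t leaf = 0;
    if (max_basic >= 4) {
        cpuid_count(4, 0, r);
        if ((r[0] & 0x1F) != 0) leaf = 4;
    }
    if (leaf == 0 && max_ext >= 0x8000001Du) {
        cpuid_count(0x8000001Du, 0, r);
        if ((r[0] & 0x1F) != 0) leaf = 0x8000001Du;
    }
    if (leaf == 0) return 0;

    int found = 0;
    for (std::uint32_t sub = 0; sub < 16; ++sub) {
        cpuid_count(leaf, sub, r);
        const CacheInfo c = decode_cache_leaf(r[0], r[1], r[2]);
        if (c.type == 0) break;      // no more cache levels
        if (c.type != 2) continue;   // keep instruction caches only
        std::printf("L%ui: %llu bytes, %u-way, %u-byte lines, %u sets\n",
                    c.level, static_cast<unsigned long long>(c.size_bytes()),
                    c.ways, c.line_size, c.sets);
        ++found;
    }
    return found;
}
```

Calling `print_instruction_caches()` from `main` should print one line per instruction cache the processor enumerates. Inside a virtual machine the hypervisor may filter or alter what `cpuid` returns, so treat the numbers as what the guest is told, not necessarily the physical hardware.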
