In the RISC-V spec, there is a CSR called `instret`, along with variants of it for the privilege modes. The spec says that `instret` is the number of retired instructions. I wonder what effect the `nop` instruction has on this register, since it performs no operation. This brings me to another question too: when the hardware internally inserts a nop to flush or stall because of some problem, should those `nop`s also be counted in `instret`?
Is the behaviour in these two cases officially specified, or is it left to the implementer?

Peter Cordes

Ömer GÜZEL
- *When there is a case that nop is inserted to flush or stall internally* - That's just called a stall, not an actual NOP. Some computer science textbooks talk about NOPs that *would* be needed in the actual machine code for a hypothetical CPU that couldn't detect hazards itself, or use that to describe a stall going through an in-order pipeline. But I'd be *very* surprised if any real CPU counted those as instructions. At that point it would just be a cycle counter, not an instruction counter, useless for measuring IPC (instructions per cycle), i.e. how much you stall. – Peter Cordes Mar 06 '23 at 06:34
- As for counting true `nop`s as retired, unless the spec says one way or another, either design is plausible. But counting them as instructions is more likely; x86 CPUs do count `nop`s in their HW perf counters. – Peter Cordes Mar 06 '23 at 06:36
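
One reason counting a true `nop` is the natural design: in RISC-V, `nop` is a pseudoinstruction for `addi x0, x0, 0` (encoding `0x00000013`), so to the decoder it is an ordinary ADDI. The sketch below is only an illustration; `decode_i_type` and `is_canonical_nop` are hypothetical helper names, not part of any real toolchain.

```python
# Illustrative only: field extraction for a 32-bit RISC-V I-type instruction.
NOP_ENCODING = 0x00000013  # canonical `nop`, i.e. `addi x0, x0, 0`
OPCODE_OP_IMM = 0x13       # major opcode shared by ADDI, ANDI, ORI, ...


def decode_i_type(word):
    """Split a 32-bit instruction word into its I-type fields."""
    imm = (word >> 20) & 0xFFF
    if imm & 0x800:          # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": imm,
    }


def is_canonical_nop(word):
    """True only for `addi x0, x0, 0`."""
    f = decode_i_type(word)
    return (f["opcode"] == OPCODE_OP_IMM and f["funct3"] == 0
            and f["rd"] == 0 and f["rs1"] == 0 and f["imm"] == 0)


print(decode_i_type(NOP_ENCODING))
print(is_canonical_nop(NOP_ENCODING), is_canonical_nop(0x00500093))
```

Here `0x00500093` is `addi x1, x0, 5`: same opcode and funct3 as the `nop`, differing only in `rd` and the immediate.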
- Adding to the above: a hardware stall is handled by what textbook RISC-V pipelines call a hazard detection unit. It inserts a bubble (or, for a flush, squashes the instruction) by setting all control signals to 0. That is effectively a nop, but it is not an instruction in itself; you can think of it as the remnants of whatever instruction happened to be there being sent on down the pipeline. It does not affect the instruction count, but it does affect the CPI. – user18348324 Mar 06 '23 at 06:36
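
To make the distinction in these comments concrete, here is a toy model of an in-order core, assuming (as the comments suggest is most likely, not as something the spec is quoted as requiring) that stall bubbles only advance the cycle counter while every architectural instruction, including a real `nop`, advances `instret`. The function name `run` and the per-instruction stall counts are made up for illustration.

```python
def run(program):
    """Toy in-order model.

    program: list of (mnemonic, bubbles) pairs, where `bubbles` is the
    number of stall cycles the hardware inserts before that instruction
    can retire. Returns (cycles, instret).
    """
    cycles = 0
    instret = 0
    for _mnemonic, bubbles in program:
        cycles += bubbles   # bubbles burn cycles but retire nothing
        cycles += 1         # the cycle in which the instruction retires
        instret += 1        # counted even when the instruction is `nop`
    return cycles, instret


program = [
    ("lw", 0),
    ("add", 1),   # hypothetical load-use stall: one bubble
    ("nop", 0),   # a real nop in the machine code
    ("beq", 0),
    ("sub", 2),   # hypothetical flush after a mispredicted branch
]

cycles, instret = run(program)
print("cycles =", cycles, "instret =", instret)
print("IPC =", instret / cycles, "CPI =", cycles / instret)
```

If the bubbles were counted as well, `instret` would always equal `cycles` and the ratio would carry no information, which is the point made above about such a counter being useless for measuring IPC.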
- Thank you both for the detailed answers. @PeterCordes I want to understand one point. If implementations differ in whether they count `nop` in `instret`, doesn't that cause a problem between two different real RV64GC cores? I mean, both CPUs may run the same operating system, but in some cases it would run differently on each of them. Shouldn't they behave the same? – Ömer GÜZEL Mar 06 '23 at 22:02
- I said it might be possible (to not count every NOP) if the spec didn't forbid it, but I think most designers would agree with you that it would make sense to count them if it can be done cheaply (transistors / power). Performance counters aren't necessary for correctness, and some *may* vary by microarchitecture. Also, reading the `instret` CSR presumably happens when the reading instruction executes, and that will be some distance ahead of retirement that depends on the microarchitecture (e.g. out-of-order exec with a big buffer). – Peter Cordes Mar 07 '23 at 00:42
- If a CPU can handle NOPs without having to spend retirement bandwidth on them (discarding them earlier in the pipeline, e.g. treating them as the tail of the previous instruction), it would be somewhat appropriate to count only 1 for the combined pair. Or more likely you'd have a separate counter for actual retirement slots vs. *instructions* retired, so you could still get a true count of architectural instructions. – Peter Cordes Mar 07 '23 at 00:43
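
The two-counter idea in the last comment can be sketched as follows. This is a purely hypothetical design, not a description of any real core: a `nop` is folded into the retirement slot of the instruction before it, so it consumes no slot of its own, while a separate counter still tracks every architectural instruction. The name `retire` is illustrative.

```python
def retire(stream):
    """Return (slots_used, instructions_retired) for a list of mnemonics.

    Hypothetical folding rule: a `nop` that follows any earlier
    instruction rides along in that instruction's retirement slot.
    A `nop` at the very start has nothing to fold into and takes a slot.
    """
    slots_used = 0
    instructions_retired = 0
    for mnemonic in stream:
        instructions_retired += 1          # architectural count: every instruction
        if mnemonic == "nop" and slots_used > 0:
            continue                       # folded: no retirement slot spent
        slots_used += 1
    return slots_used, instructions_retired


stream = ["add", "nop", "nop", "sub", "nop"]
slots, instret = retire(stream)
print("retirement slots =", slots, "instructions retired =", instret)
```

With both counters available, software that wants the architectural instruction count reads the second value, and the first remains useful for judging how much retirement bandwidth was actually consumed.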