
Can someone help me understand why we don't need forwarding between lines 1 and 3? (There is no green arrow like the one between lines 1 and 2.)

I think we need it because sub uses the value of t0, which add produces, and the read and the write of that value happen in overlapping cycles. (To be precise, the write for add happens later, when the clock rises.)

Erik Eidt
  • This is really a computer architecture question, not a computer programming question. – Raymond Chen Jan 27 '21 at 00:13
  • I’m voting to close this question because it is not a programming question as defined in the Help Center. – Rob Jan 27 '21 at 00:26
  • We have computer architecture and cpu architecture tags, so this cannot be off topic just by being in one (or both) of those categories. – Erik Eidt Jan 27 '21 at 01:40
  • @Rob: unless / until we have a computer-architecture Stack Exchange site, Stack Overflow seems like a good place for questions about how CPUs work. More complex ones about real CPUs are sometimes the answer to real-world performance questions (and as such are useful duplicate targets), so there isn't any obvious place to draw a hard cutoff between what belongs on SO in the computer-architecture tag and what doesn't. There also is some interest in reading / answering such questions, which doesn't automatically justify them, but is why they don't usually get closed for the reason you cite. – Peter Cordes Jan 27 '21 at 02:56
  • @Rob: Arguably they could belong on computer-science.SE (especially the ones about simplified or simple [classic RISC pipelines](https://en.wikipedia.org/wiki/Classic_RISC_pipeline)) or maybe electronics.SE. But some modern real-world CPUs do have scalar pipelines like classic 5-stage RISCs, besides of course MIPS R2000 and other historical CPUs that really were classic RISCs. – Peter Cordes Jan 27 '21 at 02:58

1 Answer


You are correct that the third instruction (sub) has already read an incorrect (i.e. stale) value in its decode stage, and thus requires mitigation such as forwarding.

In fact, that sub instruction has read two incorrect (stale) values, one for the first operand, t0, and one for the second operand, t3, as that register is updated by the immediately prior instruction.

The first actual register update (of t0 by add) is available in cycle 5 (1-based counting), yet the decode of the sub happens in cycle 4.  A forward is required: here it could be from the W stage of the add to the ALU (X) stage of the sub, or it could be from the M stage of the add to the D stage of the sub.

Only in the following cycle (for a 4th instruction, not shown) could the decode obtain the proper up-to-date value from the earlier instruction's W stage: if the W stage overlaps with a subsequent instruction's D stage, no forward is necessary, since the W stage finishes its write early in the cycle and the D stage is able to pick up that result.
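
The cycle arithmetic above can be checked with a small sketch. It assumes a classic 5-stage pipeline (F, D, X, M, W) with no stalls, one instruction issued per cycle, and a register file that writes early in the cycle and reads late in it; the function names are made up for illustration.

```python
# Assumed stage order of a classic 5-stage RISC pipeline.
STAGES = ["F", "D", "X", "M", "W"]


def stage_cycle(instr_index, stage):
    """Cycle (1-based) in which instruction number instr_index (1-based)
    occupies the given stage, assuming no stalls."""
    return instr_index + STAGES.index(stage)


def needs_forward(writer_index, reader_index):
    """True if the reader decodes before the writer reaches W.

    If D and W fall in the same cycle, no forward is needed, under the
    assumption that W writes early in the cycle and D reads late in it.
    """
    return stage_cycle(reader_index, "D") < stage_cycle(writer_index, "W")


if __name__ == "__main__":
    print("add W in cycle", stage_cycle(1, "W"))
    print("sub D in cycle", stage_cycle(3, "D"))
    for reader in (2, 3, 4):
        print("instruction", reader, "needs forward from 1:",
              needs_forward(1, reader))
```

Under these assumptions, instructions 2 and 3 both decode before the add writes back, while a 4th instruction would decode in the same cycle as the add's W stage.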


There is also a straightforward ALU-ALU dependency, a read-after-write (RAW) hazard, on t3 between instruction 2 (the writer) and instruction 3 (the reader) that the diagram does not call out, so that is good evidence that the diagram is incomplete with respect to showing all the hazards.
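
To enumerate every read-after-write dependency that is close enough to need forwarding, a rough sketch like the following can help. The program listing is a stand-in: only add writing t0, instruction 2 writing t3, and sub reading t0 and t3 come from the question; the `or` opcode and the registers t1, t2, t4, t5 are placeholders I made up, and the window of 2 instructions assumes the 5-stage pipeline discussed above.

```python
# Each entry: (opcode, destination register, list of source registers).
# Only add/t0, sub, and t3 are from the question; the rest is hypothetical.
PROGRAM = [
    ("add", "t0", ["t1", "t2"]),
    ("or",  "t3", ["t0", "t4"]),   # placeholder opcode and operands
    ("sub", "t5", ["t0", "t3"]),
]


def raw_hazards(program, window=2):
    """Return (writer, reader, register) triples, 1-based, for each source
    whose most recent writer is at most `window` instructions earlier."""
    hazards = []
    for r, (_, _, sources) in enumerate(program):
        for src in sources:
            # Scan backwards for the nearest earlier writer of src.
            for w in range(r - 1, max(-1, r - window - 1), -1):
                if program[w][1] == src:
                    hazards.append((w + 1, r + 1, src))
                    break
    return hazards


if __name__ == "__main__":
    for writer, reader, reg in raw_hazards(PROGRAM):
        print(f"instruction {reader} reads {reg} written by instruction {writer}")
```

With this stand-in program the sketch reports three dependencies (1 to 2 on t0, 1 to 3 on t0, 2 to 3 on t3), whereas the diagram marks only the first.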


Sometimes educators only show the clearest example of the read-after-write hazard.  There are many other hazards that are often overlooked.

Another category involves load hazards.  Normally, a load hazard is seen as requiring both a forward and a stall; this is the case if the next instruction uses the load result at the ALU.  However, if a load instruction is followed by a store instruction (storing the loaded data), a forward from M (of the load) to M (of the store) can mitigate this hazard without a stall (much the same way that an X-to-X forward can mitigate an ALU dependency hazard).

So we might note that a store instruction has two register sources, but the register for the value being stored isn't actually needed until the M stage, whereas the register for the base address computation is needed in the X (ALU) stage.  (That makes store somewhat different from, say, add, which also has two register sources, in that both of those are needed in the X stage.)
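
As a rough illustration of that difference, here is a sketch that counts stall cycles with full forwarding. It assumes the same 5-stage pipeline, that an ALU result exists at the end of X and a load result at the end of M, and that an operand must be present at the start of the stage that consumes it; the table and function names are invented for this example.

```python
# Stage offset (cycles after fetch) at whose END the result exists.
RESULT_READY_AFTER = {"alu": 2, "load": 3}   # X for ALU ops, M for loads

# Stage offset at whose START the operand is needed (forwarding assumed).
OPERAND_NEEDED_AT = {
    ("alu", "src"): 2,      # both add/sub sources are needed in X
    ("store", "base"): 2,   # address computation happens in X
    ("store", "data"): 3,   # the value to store is not needed until M
}


def stalls_with_forwarding(producer, consumer, operand, distance=1):
    """Stall cycles needed when `consumer` is `distance` instructions
    after `producer` and reads the producer's result as `operand`."""
    ready = RESULT_READY_AFTER[producer]                   # end of this cycle
    needed = distance + OPERAND_NEEDED_AT[(consumer, operand)]
    return max(0, ready + 1 - needed)


if __name__ == "__main__":
    print("alu  -> alu src    :", stalls_with_forwarding("alu", "alu", "src"))
    print("load -> alu src    :", stalls_with_forwarding("load", "alu", "src"))
    print("load -> store data :", stalls_with_forwarding("load", "store", "data"))
    print("load -> store base :", stalls_with_forwarding("load", "store", "base"))
```

In this model a load followed by a store of the loaded data needs no stall, while a load followed by a store that uses the loaded value as its base address stalls just like a load followed by an ALU use.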

Erik Eidt