
While reading this lecture, I came across this slide:

[image]

I can't see why forwarding to the ID stage would be pointless, because I have seen another resource with the following design, which does the opposite of the proposed design:

[image]

I want to understand whether one of these approaches is better or worse, since I think both can work. Most references and textbooks seem to follow the first design, but they still need to forward to the ID stage when branch hazards occur. So why do the other forwarding paths need to go to the EX stage? Why not move all of them to ID, alongside the branch forwarding, rather than adding extra multiplexers and signals?


source 1 : https://www.cs.cornell.edu/courses/cs3410/2012sp/lecture/10-hazards-i.pdf

source 2 : https://faculty.kfupm.edu.sa/coe/mudawar/coe233/lectures/13-PipelinedProcessorDesign.pdf

source 3: Harris book and same design in Hennessy book

Hussien Mostafa
    ID reads from the register file, but doesn't do anything with it yet, just sets it up for EX. If EX is going to need a value, just forward to EX. (WB can get data to ID via the register file, but we don't call that forwarding, just a register file that can pass input to output in the same cycle.) – Peter Cordes Apr 26 '23 at 09:12
    For branches, [How does MIPS I handle branching on the previous ALU instruction without stalling?](https://stackoverflow.com/q/56586551) explains how MIPS I evaluates branch conditions in EX while still keeping branch latency down to 1 cycle (EX has branch results ready in the first half-cycle, IF doesn't need an address until the second half-cycle). If you had to forward to ID, you'd need to stall on common patterns like `slt`/ `bnez` (which implements a `blt` pseudo-instruction). Other pipeline designs are possible but worse, if you can build an IF that only needs a half cycle. – Peter Cordes Apr 26 '23 at 09:15
    By design, a register file read (ID stage) should be able to deliver the up-to-date value for a register that is written in the same cycle (by WB). (Some designs deliver the WB value at the very beginning of the cycle, while the read takes place in the second half of the same cycle.) If RF cannot deliver WB values in the same cycle, then it would typically have an internal forward inside the register file itself, and if neither of those, then, yes, an external forward from WB to ID would be needed (even though there already is a datapath that delivers WB output to the register file). – Erik Eidt Apr 26 '23 at 14:56
    WB output is ready at the very beginning of the cycle. In fact, it is captured in the pipeline register at the end of the prior cycle, and that value doesn't need any computation (unlike EX, which needs the whole cycle for certain ALU calculations, or ID, which needs at least the latter half of the cycle to look up register values). So WB is different from all the other stages: it does no actual work, so it is very fast. The actual work happens inside the register file, doing the write. – Erik Eidt Apr 26 '23 at 15:02
    Forwarding from EX to ID would require a stall cycle when there is a data hazard, because the ID cycle for the 2nd instruction has already executed during the same cycle as the EX of the 1st instruction. The best solution here is forward from end of EX to beginning of EX, and ignore the ID stage results for that hazard, rather than rerunning the ID stage, which would add a cycle. – Erik Eidt Apr 26 '23 at 18:28

1 Answer


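As Erik pointed out in the comments, the textbooks assume that the register file can be written and read in the same cycle (the write happens on the negative clock edge, and the read value is captured on the next positive edge). A value written by WB is therefore already visible to the instruction in ID during that same cycle, and the remaining hazards can be handled by forwarding to EX, since that is where the data is needed in the next cycle.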

The other lecture assumes that the register file needs a full cycle for a read or a write. Getting data from the registers to EX would then take an extra cycle and stall the pipeline, so it is easier to forward to the Decode stage in that case.
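To make the difference between the two assumptions concrete, here is a minimal Python sketch. It is a behavioral toy model, not code from either lecture; the class `RegFile` and its `write_first` flag are hypothetical names introduced only for illustration. With `write_first=True` the write lands in the first half of the cycle, so a read in the same cycle sees the new value. With `write_first=False` the write only becomes visible in the following cycle.

```python
class RegFile:
    """Toy 32-entry register file; register 0 always reads as 0."""

    def __init__(self, write_first):
        self.regs = [0] * 32
        # Assumption: True models a write in the first half of the cycle
        # (e.g. on the falling clock edge); False models a full-cycle write.
        self.write_first = write_first

    def cycle(self, read_addr, write_addr=None, write_data=0):
        """Simulate one clock cycle: one read (ID) and an optional write (WB)."""
        if self.write_first and write_addr:
            self.regs[write_addr] = write_data  # visible to this cycle's read
        value = self.regs[read_addr] if read_addr != 0 else 0
        if not self.write_first and write_addr:
            self.regs[write_addr] = write_data  # visible from the next cycle on
        return value


same_cycle = RegFile(write_first=True)
full_cycle = RegFile(write_first=False)
print(same_cycle.cycle(read_addr=5, write_addr=5, write_data=42))  # new value
print(full_cycle.cycle(read_addr=5, write_addr=5, write_data=42))  # stale value
print(full_cycle.cycle(read_addr=5))                               # new value, one cycle later
```

In the second case the instruction in ID picks up a stale value, which is why such a design needs either a stall or an explicit forwarding path from WB.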

The textbook assumption can be implemented in SystemVerilog as shown below:

module regfile(input  logic        clk, 
               input  logic        we3, 
               input  logic [4:0]  ra1, ra2, wa3, 
               input  logic [31:0] wd3, 
               output logic [31:0] rd1, rd2);

  logic [31:0] rf[31:0];

  // three-ported register file
  // read two ports combinationally
  // write third port on the falling edge of clk (pipelined processor),
  // so a value written by WB is visible to a read later in the same cycle
  // register 0 hardwired to 0

  always_ff @(negedge clk)
    if (we3) rf[wa3] <= wd3;    

  assign rd1 = (ra1 != 0) ? rf[ra1] : 0;
  assign rd2 = (ra2 != 0) ? rf[ra2] : 0;
endmodule

The code is taken from the repository below:

https://github.com/HMS-ELKHOLY/pipline_mips

The design in the lecture, however, assumes ordinary registers whose values take a full cycle to write or read.
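The stall cost that the comments describe for forwarding into ID can be counted with a small Python sketch. It is only a toy model under stated assumptions: a classic five-stage pipeline issuing one instruction per cycle, an ALU result that is ready at the end of the producer's EX cycle, and a consumer that needs the value at the start of the stage it is forwarded to. The names `STAGE_INDEX` and `stall_cycles` are hypothetical and do not come from any of the sources.

```python
# Stage positions in a classic 5-stage pipeline (IF, ID, EX, MEM, WB).
STAGE_INDEX = {"IF": 0, "ID": 1, "EX": 2, "MEM": 3, "WB": 4}


def stall_cycles(distance, consume_stage):
    """Stalls needed when a consumer issued `distance` instructions after an
    ALU producer receives the forwarded result in `consume_stage`."""
    # The producer is issued at cycle 0; its result is ready at the end of EX.
    producer_ex = STAGE_INDEX["EX"]
    # Cycle in which the consumer would reach the consuming stage unstalled.
    consumer_cycle = distance + STAGE_INDEX[consume_stage]
    # The consuming stage must start strictly after the producer's EX cycle.
    return max(0, producer_ex + 1 - consumer_cycle)


for distance in (1, 2, 3):
    print(distance, stall_cycles(distance, "EX"), stall_cycles(distance, "ID"))
```

Under these assumptions, back-to-back dependent ALU instructions need no stall when the result is forwarded to EX, and one stall cycle when it has to be forwarded to ID. That matches the case of a branch resolved in ID that depends on the immediately preceding ALU instruction.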

Hussien Mostafa