Division in Verilog and Q factor representation

Question

I am currently working on the design of a signal-processing algorithm. I created a model in software that appears to work fine, and I am now trying to translate it to Verilog. Below is what I do in the software.

I get a 16-bit input and do the following:

    a. convert the hex to decimal
    b. subtract 32768 from the above
    c. divide the result by 32768
    d. convert the result to a signed number in the Q(4,20) format (4 bits for integer, 20 bits fractional)

For example,

case 1: value is 0x803f (ie, value > 0x8000)

    a. Convert 0x803f to decimal, ie, 32831
    b. 32831 - 32768 = 63
    c. 63/32768 = 0.001922
    d. convert to signed Q(4,20) 0x0007C8
    
    
case 2: value is 0x7f9c (ie, value < 0x8000)

    a. Convert 0x7f9c to decimal, ie, 32668
    b. 32668 - 32768 = -100
    c. -100/32768 = -0.003051
    d. convert to signed Q(4,20) 0xFFF3B7

I started with the following code to translate the software model to Verilog, but the results don't match what I expected. For case 1 and case 2, the observed value of Y_DIV in simulation is 0x1. I expected a value close to Y (0x003f for case 1 and 0xff9c for case 2), as 1/32768 is relatively small.

With regard to converting 0x003f and 0xff9c to Q(4,20): if I sign-extend 0x003f, how do we arrive at the target value of 0x0007C8?

I am not sure if the following is the right way to translate my algorithm:

module test(
  input clk,
  input rst,
  input [15:0] a,
  input [15:0] b,
  output reg signed [15:0] y,
  output reg signed [15:0] y_div
);

  always @(posedge clk) begin
    if (rst) begin
        y <= 0;
        y_div <= 0;
    end else begin
        y <= (a-b);
        // since (1/32768 = 0.000030) are we better off using a multiply instead of divide ?  
        y_div <= (a-b)/b; // returns wrong value, ie doesn't match expected value
        // Q(4,20) is yet to be done. First need to get the above working   
    end
  end
  
endmodule

Below is my test bench

module tb_test();
  reg CLK, RST;
  
  reg [15:0] A, B;
  wire [15:0] Y, Y_DIV;
  initial begin
    CLK = 0;
    forever #1 CLK = ~CLK;
  end
  
  initial begin
    RST = 1;
    #10;
    RST = 0;
  end
  
  initial begin
    A = 16'b0;
    B = 16'b0;
    #10;
    A = 16'h803f; // case 1
    B = 16'h8000;
    #10;
    A = 16'h7f9c; // case 2
    B = 16'h8000;
  end
  
  test dut (.clk(CLK), .rst(RST), .a(A), .b(B), .y(Y), .y_div(Y_DIV));
  
endmodule

Thanks @Mikef, you are right. I am using a 16-bit internal representation and 24-bit output precision. Can you please elaborate on "If that is the case you are shifting in zeros, and carrying zeros around" for the case of 0x003f? This is what I might be missing. — user2532296, Mar 06 '23 at 20:35
Thanks @Mikef for all the suggestions. I will have a look into it. — user2532296, Mar 06 '23 at 22:37

Mikef · Accepted Answer · 2023-03-07T20:35:13.357

The Verilog '/' operator, when given integer operands n and d, performs integer division: it yields the integer quotient and throws the remainder away.
If the magnitude of the true quotient is less than 1, Verilog returns 0.
This is not directly useful in designs that need fractional quotients.
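
As a minimal sketch of this truncation (the module name `div_truncation_demo` and its variables are hypothetical, not part of the original design), the following simulation-only snippet divides the two numerators from the question by 32768, first directly and then after scaling by 2**20. It assumes a simulator that supports Verilog-2001.

```verilog
// Hypothetical demo module, for illustration only.
module div_truncation_demo;
  integer n_pos, n_neg, d;
  initial begin
    n_pos = 63;     // case 1 numerator from the question
    n_neg = -100;   // case 2 numerator from the question
    d     = 32768;
    // Plain integer division: the fractional part is discarded.
    $display("63 / 32768             = %0d", n_pos / d);             // 0
    $display("-100 / 32768           = %0d", n_neg / d);             // 0
    // Scaling the numerator by 2**20 first keeps 20 fractional bits.
    $display("(63 * 2**20) / 32768   = %0d", (n_pos * (2**20)) / d); // 2016
    $display("(-100 * 2**20) / 32768 = %0d", (n_neg * (2**20)) / d); // -3200
    $finish;
  end
endmodule
```

The scaled results, 2016 and -3200, are the Q(4,20) integers for 63/32768 and -100/32768.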

Let's avoid the '/' operator and use >>> instead (arithmetic right shift, which sign-extends when the left-hand operand is signed).
Dividing by 32768 = 2**15 is a shift of 15 bits to the right.
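
A small sketch of why the signed declaration matters for >>> (the module name `ashift_demo` and the variables `s` and `u` are hypothetical): the same 17-bit pattern is shifted once as a signed value and once as an unsigned value.

```verilog
// Hypothetical demo module, for illustration only.
module ashift_demo;
  reg signed [16:0] s;  // signed 17-bit value
  reg        [16:0] u;  // same bit pattern, declared unsigned
  initial begin
    s = -100;
    u = s;              // copies the bits 17'h1ff9c
    // Signed operand: copies of the sign bit are shifted in.
    $display("signed   >>> 2 = %0d", s >>> 2);  // -25
    // Unsigned operand: zeros are shifted in, so the sign is lost.
    $display("unsigned >>> 2 = %0d", u >>> 2);  // 32743
    $finish;
  end
endmodule
```

This is why the intermediate registers in the RTL are declared `reg signed`.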

Let's first shift the a - b term 20 bits to the left to create a representation with 20 fractional bits.

a - b needs to be 17 bits wide to hold the full signed result of the 16-bit subtraction.

I also synchronized the testbench to the clock edge.

The output of this module gets close to your vectors.
(To get a bit-exact match for all the fixed-point values, the floating-point numbers in the software model would need to be quantized.)
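
The arithmetic behind the two shifts can be worked out as follows. The hex values come from the question and from the simulation log; the remark about the question's target values is an inference from the numbers, not something stated in the post.

```latex
\begin{aligned}
y_{Q(4,20)} &= \frac{a - b}{2^{15}} \cdot 2^{20} = (a - b) \cdot 2^{5} \\
63 \cdot 2^{5} &= 2016 = \texttt{0x0007E0} \\
-100 \cdot 2^{5} &= -3200 \;\Rightarrow\; 2^{24} - 3200 = 16774016 = \texttt{0xFFF380}
\end{aligned}
```

For comparison, the question's targets decode to 0x0007C8 = 1992 and 0xFFF3B7 = -3145. These agree with 0.0019 * 2^20 = 1992.29 and -0.003 * 2^20 = -3145.73 truncated toward zero, which suggests the decimals were rounded before the conversion in the software model.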
RTL:

module test
#(
  parameter DI_IN_WIDTH         = 16,
  parameter FRACTIONAL_OUT_BITS = 20,
  parameter OUT_BITS_4_DOT_20   = 24
)
(
  input clk,
  input rst,
  input  [DI_IN_WIDTH - 1:0] a,
  input  [DI_IN_WIDTH - 1:0] b,
  //
  // declared as a net (not reg) because it is driven by a continuous assign
  output [OUT_BITS_4_DOT_20 - 1 :0] y
);  
  
  localparam SUM_WIDTH           = DI_IN_WIDTH  + 1;
  localparam SUM_PLUS_FRACT_BITS = SUM_WIDTH + FRACTIONAL_OUT_BITS;

  reg signed [SUM_WIDTH - 1:0]           a_minus_b;
  // 17.20 numbers
  reg signed [SUM_PLUS_FRACT_BITS - 1:0] a_minus_b_w_fract_bits;
  reg signed [SUM_PLUS_FRACT_BITS - 1:0] a_minus_b_w_fract_bits_div;

  
  // a - b
  always @ * begin
    a_minus_b = a - b;
    a_minus_b_w_fract_bits = a_minus_b << FRACTIONAL_OUT_BITS; 
  end
    
  always @(posedge clk) begin
    if (rst) begin
      a_minus_b_w_fract_bits_div <= 0;
    end else begin
      // divide by 2^15
      a_minus_b_w_fract_bits_div <= a_minus_b_w_fract_bits >>> 15;
    end
  end
  
  assign y = a_minus_b_w_fract_bits_div[OUT_BITS_4_DOT_20 - 1 : 0];
  
//   initial begin
//     $display("DI_IN_WIDTH         = %0d",DI_IN_WIDTH);
//     $display("SUM_WIDTH           = %0d",SUM_WIDTH);
//     $display("FRACTIONAL_OUT_BITS = %0d",FRACTIONAL_OUT_BITS);
//     $display("SUM_PLUS_FRACT_BITS = %0d",SUM_PLUS_FRACT_BITS);
//     $display("OUT_BITS_4_DOT_20   = %0d",OUT_BITS_4_DOT_20);
//   end
  
endmodule
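
Since a left shift by 20 followed by an arithmetic right shift by 15 is a net left shift by 5, the same values can be produced without the 37-bit intermediate. The following is only a sketch under stated assumptions: the names `q4_20_scale` and `q4_20_scale_tb` are hypothetical, the widths are fixed at 16 bits in and 24 bits out, and the logic is purely combinational (it omits the output register of the module above).

```verilog
// Hypothetical simplified variant, for illustration only.
module q4_20_scale (
  input  wire        [15:0] a,
  input  wire        [15:0] b,
  output wire signed [23:0] y
);
  // 17-bit signed difference; a and b are zero-extended before subtracting.
  wire signed [16:0] diff = {1'b0, a} - {1'b0, b};
  // Sign-extend to 24 bits, then multiply by 2**5.
  wire signed [23:0] diff_ext = diff;
  assign y = diff_ext <<< 5;
endmodule

// Minimal check using the vectors from the question.
module q4_20_scale_tb;
  reg         [15:0] a, b;
  wire signed [23:0] y;

  q4_20_scale dut (.a(a), .b(b), .y(y));

  initial begin
    a = 16'h803f; b = 16'h8000;
    #1 $display("case 1: y = %h", y);  // 0007e0
    a = 16'h7f9c; b = 16'h8000;
    #1 $display("case 2: y = %h", y);  // fff380
    $finish;
  end
endmodule
```

The largest magnitude of a - b is 65535, and 65535 * 32 fits in a signed 24-bit value, so the left shift cannot overflow at these widths.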

Testbench:

module tb_test();

  localparam DI_IN_WIDTH         = 16;
  localparam FRACTIONAL_OUT_BITS = 20;
  localparam OUT_BITS_4_DOT_20   = 24;
  
  reg CLK, RST;
  
  reg signed [DI_IN_WIDTH - 1:0] A, B;
  // a net, because it is driven by the DUT output port
  wire signed [OUT_BITS_4_DOT_20 - 1 :0] Y;
  
  initial begin
    CLK = 0;
    forever #1 CLK = ~CLK;
  end
  
  initial begin
    RST = 1;
    repeat(2) @(posedge CLK);
    RST = 0;
  end
  
  initial begin
    $display("Test Starting");
    A <= 16'h803f; // case 1 32831
    B <= 16'h8000; // 32,768
    repeat(3) @(posedge CLK);
    $strobe("  t= %0t, a - b = %0d, Y_hex = %h, Y_dec = %0d, Y_real_FMT_4_20 = %f",
      $time,dut.a_minus_b,Y,Y,$itor(Y)/2**20);
    //
    A <= 16'h7f9c; // case 2
    B <= 16'h8000;      
    repeat(1) @(posedge CLK);
    $strobe("  t= %0t, a - b = %0d, Y_hex = %h, Y_dec = %0d, Y_real_FMT_4_20 = %f",
      $time,dut.a_minus_b,Y,Y,$itor(Y)/2**20);
    //
    A <= 16'h8000; // case 3, max value numerator
    B <= 16'h0000;      
    repeat(1) @(posedge CLK);
    $strobe("  t= %0t, a - b = %0d, Y_hex = %h, Y_dec = %0d, Y_real_FMT_4_20 = %f",
      $time,dut.a_minus_b,Y,Y,$itor(Y)/2**20);
    //
    repeat(1) @(posedge CLK);
    $display("Test Done");
    $finish;    
  end
  
  initial begin
    $dumpfile("dump.vcd"); 
    $dumpvars;
  end
  
  test dut (
    .clk(CLK),
    .rst(RST),
    .a(A),
    .b(B),
    .y(Y)
    );

endmodule

Result:

xcelium> run
Test Starting
  t= 5, a - b = 63, Y_hex = 0007e0, Y_dec = 2016, Y_real_FMT_4_20 = 0.001923
  t= 7, a - b = -100, Y_hex = fff380, Y_dec = -3200, Y_real_FMT_4_20 = -0.003052
  t= 9, a - b = 32768, Y_hex = 100000, Y_dec = 1048576, Y_real_FMT_4_20 = 1.000000
Test Done
Simulation complete via $finish(1) at time 11 NS + 0
