
I am trying to write a module to "blur" a raster image by taking the average of a 3x3 pixel sliding window across the image. The input image is a series of bytes holding the RGB values of the pixels, row by row, from bottom to top (similar to a bitmap). The module computes and outputs the average of the 3x3 window on the positive clock edge and "moves" the window on the negative clock edge. When it is finished, it drives the DONE wire high.

I come from a software background and want to know if I'm making any obvious errors or falling into common pitfalls. As I am trying to learn Verilog without a physical FPGA, I have only been using icarus-verilog to test my code, and I have no idea whether I am writing my test benches correctly or whether my code is actually synthesizable. I was hoping someone could take a look and give feedback.

I am also unsure of when and why to use blocking (=) versus non-blocking (<=) assignment. I read online that = is for combinational logic and <= is for sequential logic. But how do I know when I should be using combinational or sequential logic?
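
For reference, the rule of thumb quoted above is usually illustrated with something like the following minimal sketch. The module name `assign_styles` and all signal names are hypothetical and are not part of the blur module; the block only shows the two styles side by side.

```verilog
// Hypothetical module and signal names, for illustration only.
module assign_styles (
    input  wire       clk,
    input  wire [7:0] a,
    input  wire [7:0] b,
    output reg  [8:0] sum_comb, // combinational: follows a and b with no clock
    output reg  [8:0] sum_reg   // sequential: changes only on a rising clock edge
);
    // Combinational logic: the output is a pure function of the current inputs,
    // so the block is sensitive to every input (@*) and uses blocking (=).
    always @* begin
        sum_comb = a + b;
    end

    // Sequential logic: a register stores the value between clock edges,
    // so the block is clocked and uses non-blocking (<=).
    always @(posedge clk) begin
        sum_reg <= sum_comb;
    end
endmodule
```

In this sketch the choice follows from whether a value has to be remembered across clock cycles (a register) or is just recomputed from its inputs (plain logic).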

Thank you!

module blur
    #(parameter
        WIDTH  = 640,   // Image width
        HEIGHT  = 426,   // Image height
        INPUTBYTES = WIDTH*HEIGHT*3
    )
    (
      input CLK,          // clock     
      input RST,         // Reset (active low)
      input wire [7 : 0] input_memory [0 : INPUTBYTES-1],// memory to store  8-bit data image
      output reg [7 : 0] output_memory [0 : INPUTBYTES-1],// memory to store  8-bit data image
      output wire DONE     // Done flag
    );

    integer row = 0; // row index of the image
    integer col = 0; // col index of the image
    integer counter = 0;

    reg [7:0]  DATA_R = 0;  // 8 bit Red data
    reg [7:0]  DATA_G = 0;  // 8 bit Green data
    reg [7:0]  DATA_B = 0;  // 8 bit Blue data

    assign DONE = (counter > INPUTBYTES-1);

    always@(negedge CLK) begin

        if (col == WIDTH-3) begin
            col <= 0;
            row <= row+1;
        end
        else begin
            col <= col+1;
        end

        counter <= counter+3;
    end


    always@(posedge CLK) begin
        integer i;
        integer j;

        // Sum up all the R, G, B values of the 3x3 square
        for(i=0; i<3; i=i+1) begin
            for(j=0; j<3; j=j+1) begin
                DATA_R = DATA_R + input_memory[3*(WIDTH*(row+i)+(col+j))+0];
                DATA_G = DATA_G + input_memory[3*(WIDTH*(row+i)+(col+j))+1];
                DATA_B = DATA_B + input_memory[3*(WIDTH*(row+i)+(col+j))+2];
            end
        end

        // Output is the averaged RGB values
        output_memory[counter+0] = DATA_R/9;
        output_memory[counter+1] = DATA_G/9;
        output_memory[counter+2] = DATA_B/9;
    end
endmodule
chris
  • Divide by 9 is not going to synthesize. Division in RTL Verilog has limitations; it is a broad topic, and a lot has been written about it. An Internet search will help. If you need to divide by 9, a separate divider design is needed. – Mikef Sep 15 '22 at 14:21
  • You may run into trouble synthesizing stuff like ` = input_memory[... row... col ...];` That is selecting one 8-bit value (based on the row,col values) from WIDTH*HEIGHT*3 = 921600 total bytes. That is a 921600 -> 1 multiplexer, and you do that three times total for each color. That is lots and lots of LUTs. You might be lucky if the tool can map it to BRAM (it might have trouble with your row,col math to determine the index, plus registering IOs). But even then, that's a lot of BRAM too... – Julian Kemmerer Sep 15 '22 at 15:16
  • The for loops may not give you the expected result. They will synthesize into 9 combinational sub-circuits that all share the same clock. If the sequence of operations is important, this won't give you what you want. – richbai90 Sep 16 '22 at 03:22
  • To ease timing and combinational explosion, I suggest you pipeline your adder tree (after the initial latency, you will still get 1 output per clock cycle). This is a place to start: https://github.com/pConst/basic_verilog/blob/master/adder_tree.sv – Fra93 Sep 16 '22 at 07:45
  • Okay, thanks all! As suggested, I'll first read up on how to write synthesizable division and large additions, then come back to this in the future. – chris Sep 16 '22 at 08:00
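
One common way to avoid a general divider for the divide-by-9 step discussed in the comments is to multiply by a fixed-point reciprocal and shift. The sketch below is only an illustration of that technique, not code from the thread: the module name `avg9` and its ports are hypothetical. It also assumes the nine 8-bit samples have already been summed into a 12-bit value, since 9 * 255 = 2295 does not fit in the 8-bit `DATA_R`, `DATA_G` and `DATA_B` registers used in the question.

```verilog
// Hypothetical helper module; the name and port list are assumptions for illustration.
module avg9 (
    input  wire [11:0] sum,  // sum of nine 8-bit samples, at most 9*255 = 2295
    output wire [7:0]  avg   // floor(sum / 9)
);
    // 7282 = ceil(2^16 / 9), so 9 * 7282 = 2^16 + 2.
    // The extra error term is sum * 2 / (9 * 2^16), which stays far below 1/9
    // for sum <= 2295, so (sum * 7282) >> 16 equals floor(sum / 9) on this range.
    wire [24:0] product = sum * 13'd7282;

    // Taking bits [23:16] is the right shift by 16; the largest product,
    // 2295 * 7282 = 16712190, is below 2^24, so bit 24 is never set.
    assign avg = product[23:16];
endmodule
```

The multiply is by a constant, so it reduces to a handful of shifted additions, and it can be registered as one more pipeline stage after the adder tree if timing requires it.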
