I've got one very specific problem with a project that has been haunting me for days now. I have the following Verilog code for a RAM module:

module RAM_param(clk, addr, read_write, clear, data_in, data_out);
    parameter n = 4;
    parameter w = 8;

    input clk, read_write, clear;
    input [n-1:0] addr;
    input [w-1:0] data_in;
    output reg [w-1:0] data_out;

    reg [w-1:0] reg_array [2**n-1:0];

    integer i;
    initial begin
        for( i = 0; i < 2**n; i = i + 1 ) begin
            reg_array[i] <= 0;
        end
    end

    always @(negedge(clk)) begin
        if( read_write == 1 )
            reg_array[addr] <= data_in;
        if( clear == 1 ) begin
            for( i = 0; i < 2**n; i = i + 1 ) begin
                reg_array[i] <= 0;
            end
        end
        data_out = reg_array[addr];
    end
endmodule

It behaves exactly as expected in simulation; however, when I go to synthesize it I get the following:

Synthesizing Unit <RAM_param_1>.
    Related source file is "C:\Users\stevendesu\---\RAM_param.v".
        n = 11
        w = 16
    Found 32768-bit register for signal <n2059[32767:0]>.
    Found 16-bit 2048-to-1 multiplexer for signal <data_out> created at line 19.
    Summary:
    inferred 32768 D-type flip-flop(s).
    inferred 2049 Multiplexer(s).
Unit <RAM_param_1> synthesized.

32768 flip-flops! Why doesn't it just infer a block RAM? This RAM module is so huge (and I have two of them, one for instruction memory and one for data memory) that it consumes the entire available area of the FPGA... times 2.4.

I've been trying everything to force it to infer a block RAM instead of 33k flip flops, but unless I can get it figured out soon I may have to greatly reduce the size of my memory just to fit on a chip.

stevendesu
  • Have you reviewed the datasheets for your FPGA to see what kind of RAMs it offers? There are probably application notes from your FPGA vendor about inferring RAMs as well. Off the top of my head, I think most RAMs don't have a 'clear' function that wipes the whole RAM, so maybe you could try removing that. – Tim Dec 18 '13 at 06:04
  • The synthesizer can't "infer" a structure that doesn't physically exist on your FPGA, and different FPGAs have different RAMs. This isn't a Verilog problem, you need to read the manuals for your FPGA. –  Dec 18 '13 at 12:04
  • Sorry, I have updated my answer: it now covers block RAM, not distributed RAM. Really sorry about that. – Khanh N. Dang Dec 20 '13 at 03:20

1 Answer

I just removed part of your code (the clear logic). The result looks like this:

module RAM_param(clk, addr, read_write, clear, data_in, data_out);
parameter n = 4;
parameter w = 8;

input clk, read_write, clear;
input [n-1:0] addr;
input [w-1:0] data_in;
output reg [w-1:0] data_out;

// Start module here!
reg [w-1:0] reg_array [2**n-1:0];

integer i;
initial begin
    for( i = 0; i < 2**n; i = i + 1 ) begin
        reg_array[i] <= 0;
    end
end

always @(negedge(clk)) begin
    if( read_write == 1 )
        reg_array[addr] <= data_in;
    //if( clear == 1 ) begin
        //for( i = 0; i < 2**n; i = i + 1 ) begin
            //reg_array[i] <= 0;
        //end
    //end
    data_out = reg_array[addr];
end
endmodule  

Initializing the memory to all zeros may not need any code. If you do want to initialize it, load the contents from a file:

initial
begin
    $readmemb("data.dat", reg_array);
end

This is the result I got from ISE 13.1:

Synthesizing (advanced) Unit <RAM_param>.
INFO:Xst:3231 - The small RAM <Mram_reg_array> will be implemented on LUTs in order to maximize performance and save block RAM resources. If you want to force its implementation on block, use option/constraint ram_style.

    -----------------------------------------------------------------------
    | ram_type           | Distributed                         |          |
    -----------------------------------------------------------------------
    | Port A                                                              |
    |     aspect ratio   | 16-word x 8-bit                     |          |
    |     clkA           | connected to signal <clk>           | fall     |
    |     weA            | connected to signal <read_write>    | high     |
    |     addrA          | connected to signal <addr>          |          |
    |     diA            | connected to signal <data_in>       |          |
    |     doA            | connected to internal node          |         

Update: many thanks to mcleod_ideafix. Sorry, I forgot that your question was about block RAM, not distributed RAM. To get block RAM you must force it: Synthesize - XST -> Process Properties -> HDL Options -> RAM Style -> change from Auto to Block. The result is then:

Synthesizing (advanced) Unit <RAM_param>.
INFO:Xst:3226 - The RAM <Mram_reg_array> will be implemented as a BLOCK RAM, absorbing the following register(s): <data_out>
    -----------------------------------------------------------------------
    | ram_type           | Block                               |          |
    -----------------------------------------------------------------------
    | Port A                                                              |
    |     aspect ratio   | 16-word x 8-bit                     |          |
    |     mode           | read-first                          |          |
    |     clkA           | connected to signal <clk>           | fall     |
    |     weA            | connected to signal <read_write>    | high     |
    |     addrA          | connected to signal <addr>          |          |
    |     diA            | connected to signal <data_in>       |          |
    |     doA            | connected to signal <data_out>      |          |
    -----------------------------------------------------------------------
    | optimization       | speed                               |          |
    -----------------------------------------------------------------------
Unit <RAM_param> synthesized (advanced).
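
The same request can also be made in the source instead of the GUI. The following is a minimal sketch, assuming XST's `ram_style` constraint written as a Verilog-2001 attribute on the memory array; the module name `RAM_param_block` is hypothetical, and the parameter defaults are simply the values from the question's synthesis log. Check the synthesis report to confirm what was actually inferred.

```verilog
// Sketch only: the single-port RAM from above with the clear logic removed,
// and the XST ram_style constraint attached to the array in the source.
module RAM_param_block(clk, addr, read_write, data_in, data_out);
    parameter n = 11;   // address width (value from the question's log)
    parameter w = 16;   // data width (value from the question's log)

    input clk, read_write;
    input [n-1:0] addr;
    input [w-1:0] data_in;
    output reg [w-1:0] data_out;

    // Request block RAM for this array only, instead of changing the
    // project-wide RAM Style property.
    (* ram_style = "block" *)
    reg [w-1:0] reg_array [2**n-1:0];

    always @(negedge clk) begin
        if (read_write)
            reg_array[addr] <= data_in;  // synchronous write
        data_out <= reg_array[addr];     // synchronous read; returns the old word during a write
    end
endmodule
```

Putting the attribute on the array keeps the choice local to this memory, so other small memories in the design can still be left on auto.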

End of Update

I recommend reading the XST user guide for RAM sample code, along with the device data sheet. For example, in some FPGAs the LUT RAM has no usable reset signal. If you try to reset it, the synthesizer has to add extra logic to perform the reset, and the result is an array of D flip-flops instead of a RAM. As far as I understand, any reset of the memory is tied to the system reset rather than to a user-defined signal.
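
If a run-time clear is really required, one common workaround is to zero the memory one word per clock through the normal write port, so that no single cycle touches every cell. The sketch below is an illustration under stated assumptions, not code from the question or from XST documentation: the module name `RAM_seq_clear`, the `busy` output, and the `clr_addr` counter are all hypothetical, and whether a given tool maps it to block RAM should be checked in the synthesis report.

```verilog
// Sketch only: single-port RAM whose "clear" request starts a sequential
// sweep that writes zero to one address per clock (2**n clocks in total).
module RAM_seq_clear(clk, addr, read_write, clear, data_in, data_out, busy);
    parameter n = 4;
    parameter w = 8;

    input clk, read_write, clear;
    input [n-1:0] addr;
    input [w-1:0] data_in;
    output reg [w-1:0] data_out;
    output reg busy;                 // high while the sweep is running (hypothetical port)

    reg [w-1:0] reg_array [2**n-1:0];
    reg [n-1:0] clr_addr;            // address currently being cleared

    initial begin
        busy     = 1'b0;
        clr_addr = {n{1'b0}};
    end

    // The sweep takes over the single port while busy is high.
    wire         we    = busy | read_write;
    wire [n-1:0] maddr = busy ? clr_addr  : addr;
    wire [w-1:0] wdata = busy ? {w{1'b0}} : data_in;

    always @(negedge clk) begin
        if (we)
            reg_array[maddr] <= wdata;
        data_out <= reg_array[maddr];   // not meaningful to the user while busy
    end

    // Sweep control: start on clear, stop after the last address.
    always @(negedge clk) begin
        if (busy) begin
            clr_addr <= clr_addr + 1'b1;
            if (clr_addr == {n{1'b1}})
                busy <= 1'b0;
        end else if (clear) begin
            busy     <= 1'b1;
            clr_addr <= {n{1'b0}};
        end
    end
endmodule
```

The trade-off is that user reads and writes are ignored while `busy` is high, so the surrounding logic has to wait for the sweep to finish.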

For block RAM (as opposed to LUT RAM), I prefer to specify the depth and data width explicitly, generate the memory with the core generator, or instantiate it directly from the library. More source code for general usage (ASIC/FPGA) can be found here: http://asic-world.com/examples/verilog/ram_dp_sr_sw.html

Khanh N. Dang
  • I just wanted to point out that you have replaced a hardware reset with a run-time load of the RAM. I think that's exactly the right thing to do since a generic RAM typically will not allow an asynchronous reset of all cells, and `initial` blocks are not generally synthesizable. –  Dec 18 '13 at 12:01
  • This fixed everything! The clear bit was actually something in our professor's example code. I'll have to let him know it broke everything. – stevendesu Dec 18 '13 at 15:28
  • If there is a "clear", it is usually implemented as `if (clear==1'b1) begin ... end else if (read_write==1'b1) begin ... end else begin ... end`. Also, `negedge(clk)` is normally written as `negedge clk` – Greg Dec 18 '13 at 18:17
  • @JoeHass: I know the `initial` block is not synthesizable, but initializing to zero adds nothing to a RAM design; it is only needed for ROM initialization. I just wanted to keep the code shorter :D. I tried adding a reset, but then it was not synthesized as RAM (just an array of FFs). As I understand it, a RAM uses the system reset, not a user-defined one, because we don't need to clear the RAM at run time. If the system did need that, I would define the storage as registers. – Khanh N. Dang Dec 19 '13 at 05:16
  • Your result from ISE indicates that distributed RAM has been used, not block RAM! – mcleod_ideafix Dec 19 '13 at 23:59
  • @mcleod_ideafix: Thank you for your comment, I had really forgotten that. I have updated my answer now. – Khanh N. Dang Dec 20 '13 at 03:13