I have been trying to implement a UART in order to communicate between my Lattice MachXO3D board and my computer. At the moment I am attempting to implement the transmission from the FPGA.

Upon testing on the hardware, I encountered a very strange issue. If run normally, it works for a few seconds and then suddenly stops (the CH340 connected to my computer reports that it is receiving messages containing 0x0). However, if I embed a logic analyzer into the FPGA design through the Lattice Diamond software and run the analyzer, it works perfectly for an extended period of time.

Sadly, I don't have an external logic analyzer, so the embedded logic analyzer is my only way of knowing what is actually being transmitted.

These are the files related to my implementation:

baud_gen

LIBRARY IEEE; 
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- Generates 16 ticks per bit
ENTITY baud_gen IS
  GENERIC(divider: INTEGER := 13 -- 24M/(115200*16) = 13.02, rounded to 13
          );
  PORT(
    clk, reset: IN STD_LOGIC;
    s_tick: OUT STD_LOGIC
    );
END baud_gen;

ARCHITECTURE working OF baud_gen IS
  
BEGIN
  PROCESS(clk)
    VARIABLE counter: UNSIGNED(3 DOWNTO 0) := to_unsigned(0,4);
  BEGIN
    IF clk'EVENT AND clk='1' THEN
      IF reset='1' THEN
        s_tick <= '0';
        counter := to_unsigned(0,4);
      ELSIF counter=to_unsigned(divider-1,4) then
        s_tick <= '1';
        counter:= to_unsigned(0,4);
      ELSE
        s_tick <= '0';
        counter := counter + 1;
      END IF;
    END IF;
  END PROCESS;
END working;

rs232_tx

LIBRARY IEEE; 
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

ENTITY rs232_tx IS
  PORT(clk: IN std_logic;
       tx: OUT std_logic;
       rst: IN std_logic;
       fifo_empty: IN std_logic;
       fifo_RdEn, fifo_RdClock: OUT std_logic;
       fifo_data: IN STD_LOGIC_VECTOR(7 DOWNTO 0);
       Q: OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END rs232_tx;

ARCHITECTURE working OF rs232_tx IS
  TYPE state IS (idle,start,data);
  SIGNAL tx_pulse: STD_LOGIC := '1';
  SIGNAL s_tick: STD_LOGIC;
  SIGNAL pr_state, nx_state: state := idle;
  SIGNAL data_val: std_logic_vector(7 DOWNTO 0):=(others=>'0');
  SIGNAL data_count: unsigned(2 DOWNTO 0):=to_unsigned(0,3);
BEGIN
  process(s_tick, rst)
    VARIABLE count: unsigned(3 DOWNTO 0):= to_unsigned(0,4);
  BEGIN
    IF rising_edge(s_tick) THEN
      count := count + to_unsigned(1,4);

      IF count = to_unsigned(15,4) THEN
        tx_pulse <= '1';
      ELSE
        tx_pulse <='0';
      END IF; 
    END IF;
  END PROCESS;

  process(tx_pulse,rst)
  BEGIN
    IF rising_edge(tx_pulse) THEN
      IF rst='1' THEN
        pr_state <= idle;
        data_val <= (others=>'0');
        data_count <= to_unsigned(0,3);
      ELSE
        pr_state <= nx_state;
        CASE pr_state IS
          WHEN idle =>
            data_count <= to_unsigned(0,3);
          WHEN data =>
            data_count <= data_count + to_unsigned(1,3);
          WHEN start =>
            data_val <= fifo_data;
          WHEN OTHERS =>
        END case;
        
      END IF;
    END IF;
  END process;

  process(fifo_empty,rst,data_count,pr_state,data_count)
  BEGIN
    case pr_state is
      when idle =>
        Q <= ('1','1');
        fifo_RdEn <= '0';
        tx <= '1';
        IF fifo_empty='0' AND rst='0' THEN          
          nx_state <= start;
        ELSE          
          nx_state <= idle;
        END IF;
      WHEN start =>
        Q <= ('1','0');
        fifo_RdEn <= '1';
        tx <= '0';
        nx_state <= data;
      WHEN data =>
        Q <= ('0','1');
        fifo_RdEn <= '0';
        tx <= data_val(to_integer(data_count));
        if data_count=to_unsigned(7,3) then
          nx_state <= idle;
        ELSE
          nx_state <= data;
        end if;

    end case;
  END process;
  
  fifo_RdClock <= tx_pulse;
  baud_gen: ENTITY work.baud_gen
    PORT MAP(clk,reset=>'0',s_tick=>s_tick);
END working;

testbench

LIBRARY IEEE; 
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

ENTITY rs232_tx_test IS
  GENERIC(clk_period: TIME := 41666666.7 fs;
          baud_period: TIME := 8680.55556 ns);
END rs232_tx_test;

ARCHITECTURE working OF rs232_tx_test IS
  SIGNAL clk: STD_LOGIC := '0';
  SIGNAL tx, rst, fifo_empty, fifo_RdEn, fifo_RdClock: STD_LOGIC;
  SIGNAL fifo_data: STD_LOGIC_VECTOR(7 DOWNTO 0);
BEGIN
  clk <= NOT clk AFTER clk_period/2;
  rst <= '1', '0' AFTER 100 ns;
  PROCESS
  BEGIN
    fifo_empty <= '1';
    WAIT FOR baud_period;
    fifo_empty <= '0';
    WAIT FOR baud_period*16;
  END PROCESS;

  fifo_data <= ('1','1','0','0','1','0','1','1') WHEN fifo_RdEn='1' ELSE (others=>'0');
  dut: ENTITY work.rs232_tx
    PORT MAP(clk,tx,rst,fifo_empty,fifo_RdEn,fifo_RdClock,fifo_data);
END working;

EDIT: I have tested another UART design I found online at 9600 bps and it fails in the same way. It can send a constant character, in this case 'a', to a terminal on my computer, and then it suddenly stops sending anything. However, if I start listening with the soft logic analyzer I generated in Lattice Diamond, it works without a problem and does not fail.

  • Did you simulate it? – Matthew Taylor Oct 15 '20 at 10:00
  • @MatthewTaylor Of course, under simulation it works perfectly. On the hardware, it fails after about 2 seconds. I have even simulated it for 5 seconds, which took quite a long time with a step size of 1fs, and even then it worked well. The problem is definitely something which occurs after synthesis but not in simulation – Francisco Ayala Le Brun Oct 15 '20 at 10:27
  • I'm sorry - stupid question - there's a testbench. How about STA and/or a gate level simulation? – Matthew Taylor Oct 15 '20 at 10:32
  • Possibly related to the 2-process form of state machine. But also, at `pr_state <= nx_state; CASE pr_state IS` you do realise the CASE sees the OLD value of pr_state, right? –  Oct 15 '20 at 10:55
  • I note you're using a logic generated clock `s_tick` as an actual clock. This is not recommended in FPGAs as logic generated clocks can cause timing issues. It is much better to use the same clock for all logic and use s_tick like a clock enable instead. – Tricky Oct 15 '20 at 10:56
  • Also, are you aware that your `count` counter, being a variable, is updated immediately, and hence the check is done on the new value rather than the current value? This means the adder is in the compare logic chain. If you used a signal instead, the check path would not have the adder in it, as this purely updates the counter value. (Apart from possibly putting the TX pulse in the wrong place, longer logic chains reduce the fmax of the circuit.) – Tricky Oct 15 '20 at 11:00
  • You also generate a logic clock `tx_pulse` from another logic clock `s_tick`, probably compounding the issue I mentioned above. – Tricky Oct 15 '20 at 11:03
  • The DUT produces a `fifo_rdclock` but the testbench makes no use of it. I'm sure the real system likely uses it. – Tricky Oct 15 '20 at 11:05
  • @MatthewTaylor I am using GHDL for simulation and Lattice LSE for synthesis. I am not sure how to get the timing information out of Lattice in order to do so, but once I figure it out I'll try it. – Francisco Ayala Le Brun Oct 15 '20 at 14:12
  • @BrianDrummond Yes, I do realize it. This is because I want the counter to stay at 0 for one tick after the state is changed to it. Should I have designed it differently? – Francisco Ayala Le Brun Oct 15 '20 at 14:13
  • @Tricky What would using s_tick as a clock enable look like? Would I have my process act on the rising edge of clk and then check if s_tick='1'? – Francisco Ayala Le Brun Oct 15 '20 at 14:14
  • @FranciscoAyalaLeBrun Yes - but you need to make sure s_tick is only 1 clock wide. – Tricky Oct 15 '20 at 14:37

1 Answer

This smells like a classic timing issue.

In the comments, others have explained where there are weaknesses in the code that could indirectly lead to this. I'm going to concentrate on what is occurring.

As you have stated, in simulation your code works as you expect. However, that is only half the story. To create an FPGA bitstream, the build tools take your code and several other files, synthesise the design, and run Place & Route (P&R). Your timing issue arises during P&R. This is why your simulation doesn't pick up on any errors, as I'm assuming it's an RTL (pre-place & route) simulation.

During P&R the tools lay out the logic in the way that best fits the timing model of the device, so that all paths meet their timing requirements. The path timing requirements are derived from explicit statements in a Timing Constraints file and inferred from your code (that's why your coding style matters, by the way).
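
To illustrate the coding-style point, here is a minimal sketch of the single-clock approach suggested in the comments, where `s_tick` is used as a clock enable rather than as a clock. It assumes `s_tick` is exactly one `clk` cycle wide, which the `baud_gen` in the question appears to provide. The entity name `bit_tick_gen` and the output `bit_tick` are hypothetical names introduced for this example; treat it as a starting point, not a drop-in replacement.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- Hypothetical helper (not part of the original design): divides the
-- 16x oversampling tick down to one enable pulse per bit period.
-- Every flip-flop is clocked by clk; s_tick only gates the logic.
ENTITY bit_tick_gen IS
  PORT(
    clk, rst: IN STD_LOGIC;
    s_tick: IN STD_LOGIC;     -- assumed one-clk-wide enable from baud_gen
    bit_tick: OUT STD_LOGIC   -- one-clk-wide enable, once per 16 s_ticks
    );
END bit_tick_gen;

ARCHITECTURE sketch OF bit_tick_gen IS
  -- A signal rather than a variable, so the comparison below sees the
  -- registered value and the adder stays out of the compare path.
  SIGNAL count: UNSIGNED(3 DOWNTO 0) := (OTHERS => '0');
BEGIN
  PROCESS(clk)
  BEGIN
    IF rising_edge(clk) THEN
      bit_tick <= '0';                -- default: no pulse this cycle
      IF rst = '1' THEN
        count <= (OTHERS => '0');
      ELSIF s_tick = '1' THEN         -- clock enable, not a clock edge
        count <= count + 1;           -- wraps from 15 back to 0
        IF count = 15 THEN
          bit_tick <= '1';
        END IF;
      END IF;
    END IF;
  END PROCESS;
END sketch;
```

The state register in `rs232_tx` would then be written as `IF rising_edge(clk) THEN IF bit_tick = '1' THEN ... END IF; END IF;` instead of `IF rising_edge(tx_pulse)`. With that change the design has a single clock, `clk`, for the tools to analyse, so there is only one clock to describe in the constraints.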

Once P&R is complete, the tools will put the build artefact through a static timing analyser (STA) tool and report back whether the build fails to meet the timing requirements.

This leads to two questions:

  1. Does the build report a timing error?
  2. Do you have a Timing Constraints file? If you're unsure, the answer is no.

The way to debug your problem is to use the Timing Report generated by Lattice Diamond to see where the failures are. If there are no reported failures it means your timing model is wrong because of a lack of appropriate timing constraints. As a minimum, you will need to constrain all IO in the design and describe all the clocks in your design.

Here is a good document to help you out: https://www.latticesemi.com/-/media/LatticeSemi/Documents/UserManuals/RZ/Timing_Closure_Document.ashx?document_id=45588

The reason that your design works when you use the embedded Logic Analyser is that extra logic has been placed in the design, which changes the timing model. The P&R tools lay out the design differently and, by luck, have placed it in such a way as to meet the real timing requirements.

As I often say to my Software Engineer colleagues: software languages create a set of instructions; HDL creates a set of suggestions.

Vance
  • I managed to use the FT2232 on the board instead of the CH340, and it works now. The only things that are different are the pins used and the distance the signal has to travel, so I believe it was indeed a timing issue. I will have to find a way to do timing simulations on Linux. – Francisco Ayala Le Brun Oct 18 '20 at 14:15