Missing Last Byte On Circular Buffer Burst

Question

Thanks to help from another user, I have implemented a simple circular buffer to read in some UART data, do some minor processing, then transmit on another UART. I receive a stream of 10-20 6-byte packets based on some user interaction with an outside device. I have a known SOF that is one of two values, four data bytes, then a EOF byte. The first XX packets always have the first SOF value, and the last packet always has the other known SOF value. The issue is that when I get that second SOF value, the IRQ doesn't hit again. As such, I miss the EOF byte in my state machine and the circular buffer pointer doesn't increment, so this last packet doesn't get transmitted. The correct final packet is loaded in the circular buffer (I don't store the EOF bytes), but because my head and tail indices are equal (because I don't proceed to the case STATE_WAIT_EOF state on the final byte, which would increment the head buffer), the head index doesn't get incremented, and the packet doesn't get processed/transmitted.

Here is the state machine in the IRQ handler:

void USART2_IRQHandler(void)
{
    uint8_t Curr_Byte = *RX_Touch_Data;

    switch(Pkt_State)
    {
        case STATE_WAIT_SOF:
        {
            Data_Indx = 0;

            if(( Curr_Byte == TD_PKT) || ( Curr_Byte == LO_PKT ))
            {
                Msg_Buff[Buff_Head_Indx].RX_Data[Data_Indx] = Curr_Byte;                
                ++Data_Indx;
                Pkt_State = STATE_RX_DATA;
            }
         }
         break;

        case STATE_RX_DATA:
        {
            Msg_Buff[Buff_Head_Indx].RX_Data[Data_Indx] = Curr_Byte;
            ++Data_Indx;

            if( Data_Indx == DATA_SIZE )
            {
                Pkt_State = STATE_WAIT_EOF;

                if (Msg_Buff[Buff_Head_Indx].RX_Data[0] == LO_PKT)
                {
                    LO_Pkt = TRUE;  //Set flag indicating got full data buff of last pkt
                }
            }
        }
        break;

        case STATE_WAIT_EOF:
        {
            if(LO_Pkt == TRUE)
            {
                DataFull = TRUE;    //Set breakpoint here, never hits
            }
        
            if( Curr_Byte == EOF_PKT )
            {
                uint8_t New_Buff_Head_Indx = ( Buff_Head_Indx + 1 ) % NUM_MESSAGES;     

                if( New_Buff_Head_Indx != Buff_Tail_Indx)
                {
                    Buff_Head_Indx = New_Buff_Head_Indx;
                }
                else
                {
                    //No space; msg will be lost
                }
            }

            Pkt_State = STATE_WAIT_SOF;
        }
        break;

        default:
        break;

    }

    HAL_UART_IRQHandler(&huart2);                       //Let HAL clean up flags

    HAL_UART_Receive_IT( &huart2, RX_Touch_Data, 1 );   //Get next byte
}

Of course, I also call HAL_UART_Receive_IT function once before entering my main while loop to kick off the process.

There isn't anything "special" about the last packet the device sends. I suspect if I could force the device to ONLY send the one type of packet, I'd find the same situtation: the final packet still wouldn't get sent (unfortunately, I can't do that). Also, note that if I trigger a new burst of packets via a new user interaction, the last packet of the FIRST burst gets "forced" out (which makes sense, as the head index gets incremented). So I could have ten user interactions with ten bursts of packets, and all will be fine, except the last one, which would be missing the final packet.

Does anyone see what I'm missing? I hope my explanation makes sense.

EDIT: Just for test, I blew out my circular buffer size to 10x what I expect from say, ten bursts. Just wnated to eliminate the possibility of any wraparound index issues. No change.

I've temporarily implemented a "patch" where I DON'T look for the EOF byte on the last packet. This works, but is a hack at best and I'd like to understand why it's not working without this.

EDIT2:

HAL_StatusTypeDef HAL_UART_Receive_IT(UART_HandleTypeDef *huart, 
uint8_t *pData, uint16_t Size)
{
   /* Check that a Rx process is not already ongoing */
   if (huart->RxState == HAL_UART_STATE_READY)
   {
       if ((pData == NULL) || (Size == 0U))
   {
       return HAL_ERROR;
   }

   /* Set Reception type to Standard reception */
   huart->ReceptionType = HAL_UART_RECEPTION_STANDARD;

   if (!(IS_LPUART_INSTANCE(huart->Instance)))
  {
     /* Check that USART RTOEN bit is set */
     if (READ_BIT(huart->Instance->CR2, USART_CR2_RTOEN) != 0U)
     {
        /* Enable the UART Receiver Timeout Interrupt */
        ATOMIC_SET_BIT(huart->Instance->CR1, USART_CR1_RTOIE);
     }
   }

   return (UART_Start_Receive_IT(huart, pData, Size));
 }
 else
 {
    return HAL_BUSY;
 }
}

I've set flags on the HAL_BUSY and HAL_ERROR conditions, they are never hit.

The only other potentially relevent code is where the tail index is examined for available packets:

if( Buff_Head_Indx != Buff_Tail_Indx )                                   
//There are available packets to send
{
    Result = Process_Touch_Data();
    Buff_Tail_Indx = ( Buff_Tail_Indx + 1) % NUM_MESSAGES;              
}

Are you sure it's being sent at all? Are you sure you're receiving it at all? Are you sure it's not falling into this silent swallow `//No space; msg will be lost` case? Are you even printing the whole data stream on both the sender/receiver side to better debug the issue? — Shark, May 17 '23 at 14:36
Yes, I have used a packet snooper to verify the device is sending the byte. I am NOT receiving it; that's the issue. The "case STATE_WAIT_EOF" switch is never hit after the "LO_Pkt = TRUE" flag is set (until the next burst). — nobby, May 17 '23 at 15:05
So, obviously the problem is not in the code you posted, but rather the code you didn't. The expected bytes are being sent, but the last byte is not being received. Where's the receiving code? — Shark, May 17 '23 at 16:26
The UART RX code is directly from the STM32L0x0 HAL. I've posted it above. — nobby, May 17 '23 at 16:37
Think of what happens for the very first character you receive. What is the content of `*RX_Touch_Data` when `USART2_IRQHandler()` is first called? Be aware that you have not called `HAL_UART_IRQHandler()` yet. — pmacfarlane, May 17 '23 at 17:00
Thank you @pmacfarlane once again. This is a good case where me reviewing the documentation would have helped. I thought the USART2_IRQHandler was responsible for actually receiving in the byte, and HAL_UART_IRQHandler was essentially flag clean-up. Clearly I had these (basically) reversed. — nobby, May 17 '23 at 18:26

pmacfarlane · Accepted Answer · 2023-05-17T21:07:59.600

Problem

Your data handling is all delayed by one byte. Here is the sequence of events:

You call HAL_UART_Receive_IT( &huart2, RX_Touch_Data, 1 ); This enables the receive interrupt on the UART.

A byte (let's say SOF) is received on the UART. USART2_IRQHandler() gets called. You read a byte from *RX_Touch_Data, which is not currently set to anything meaningful, and process that.

You then call HAL_UART_IRQHandler() This reads the SOF byte from the UART and stores it in *RX_Touch_Data.

When the next byte is received (let's say 0x53), your state machine processes the SOF, and then the HAL stores the 0x53 in *RX_Touch_Data. Ad-inifinitum. You're always one byte behind.

After the final EOF comes in, it is buffered in *RX_Touch_Data, and since no more bytes appear, it is not processed by the state machine.

Suggested solution

Read the byte from UART2 yourself in your interrupt handler, something like:

uint8_t Curr_Byte = UART2->DR & 0xff;

and do not call HAL_UART_IRQHandler().

I'd also recommend just enabling the receive interrupt yourself. Then you don't need to call HAL_UART_Receive_IT() any more. You just get an interrupt every time a byte is received. This might be enough:

/* Enable the UART Data Register not empty Interrupt */
__HAL_UART_ENABLE_IT(&huart2, UART_IT_RXNE);

Ah, of course. Alternately, moving HAL_UART_IRQHandler() to the the top of USART2_IRQHandler() (above even uint8_t Curr_Byte = *RX_Touch_Data of course) would accomplish the same, no? I know HAL has issues and there are many who recommend steering clear when possible. It just seems that when issues are encountered, it's easier to find support than with a roll-your-own solution. — nobby, May 17 '23 at 18:12
Moving it above would probably solve it, but it just seems so pointless buffering everything through `RX_Touch_Data`, and having to call `HAL_UART_Receive_IT()` for every byte. I would definitely not use the HAL at all for this UART (other than maybe initialising it). — pmacfarlane, May 17 '23 at 18:17

Luis Colorado · Answer 2 · 2023-05-29T09:42:02.030

IMHO you have a poorly defined set of states and so, a bad automaton to implement the protocol.

You have defined in your automaton only the correct input case, and as a general rule, you avoid to move to a different state in case the input is not the one you accept. This leads you to a state in which you can receive a misformed packet, line noise of simply bad input, and lock your receiver from that point on.

A protocol automaton to accept input must consider all input states, and the transitions for every possible input, even in the case that you think it will never happen. When you analise your input so systematically, you cover all the cases, and your automaton recovers correctly from syntax errors.

I should define the automaton as a table, one row for each state, one column for each possible input. I would put on each cell of the table, the next state on that input, and also any action in case I have recognised something, This can be implemented with a pointer to a routine that does the work.

Consider all combinations... in the code you post, you only consider that the data comes in some (correct) sequence, but this can be not the case. If you check your automaton, there's a possibility that you receive a TD_PKT, LO_PKT and switch to state STATE_RX_DATA. If then it happens some error, like you have umplugged the connector and plugged it again, you will lose a full packet before you recognice a new, incorrect packet (because you will have the packet contents up to the unplug, added with the data comming from the next packet, and that should be incorrect) there must be always some framing error state from which, on any input (except the start of a new frame) you go to the idle state (and in case of the start of a new frame, you get inside a new packet again)

In my opinion, you should name your states based on what has been recognized upto the considered state, and not on what is being expected, You should be capable to start (and recover appropiately) on whatever state you may be:

Idle state. Starting state. You have not received anything valid yet. In this state:
- TD_PKT_SOF switches you to IN_DATA_TD_PKT state. It means you have received a valid header char and you prepare to start receiving buffer data. You should empty the buffer (and see below, add the start character to the buffer, optionally), as you will start adding data bytes.
- LO_PKT_SOF switches you to IN_DATA_LO_PKT state. This recognizes an error (a partially received packet with no end of packet delimiter) You can signal the error to the user (possibly including the partially received packet) You should empty the buffer (and as described above, add it to the buffer optionally), and you will start adding new data bytes.
- END_PKT ignore this character in this state. You have not started any packet, so receiving this byte represents a spurious byte on the line.
- anything else, remain in Idle state. You treat this input exactly as END_PKT in this state. It is described separate to maintain the table format.
IN_DATA_TD_PKT state. In this state, if you receive:
- TD_PKT_SOF you have received the start of a new packet. This should be considered an error. You should transmit to the user an error, possibly passing up the malformed buffer contents. You must also empty the buffer, because you are also starting the collection of a new packet, as the bytes coming next should start filling a new buffer contents. You will switch to IN_DATA_TD_PKT (which means no state change as the packet type now is the same as before).
- LO_PKT_SOF same as above, but you start a different packet type, in this case you can signal an error to the user as above, you should empty the buffer, but you must switch now to a different state IN_DATA_LO_PKT, as now the packet type is different.
- END_PKT return a buffer to the user of a TD packet type. Switch to Idle state, and optionally you can empty the buffer contents. Emptying the buffer is done on start of a new packet, so it's redundant to be done here also.
- any other thing, add it to the buffer and don't switch state.
IN_DATA_LO_PKT. If you receive:
- TD_PKT_SOF you should declare an error (this can be done depending if the buffer is partially filled or not, so you will not get spurious errors due to cable umplugging) You need to switch to state to IN_DATA_TD_PKT, as the packet type has changed.
- LO_PKT_SOF you should remain in this state, but you should signal an error on a non empty buffer (as done in state IN_DATA_TD_PKT state, but symmetrically)
- END_PKT return a buffer to the user of a LO packet type. Switch to Idle state, and optionally empty the buffer. Emptying the buffer here is not necessary, as you always empty it when starting a new packet.
- any other thing, add it to the buffer.

As you see in this description, all states and input data are considered. You can implement this as a nested double switch (in any order) or as a table controlling a loop in which the interrupt routine is called for each received character, the character is mapped to its character class and then processed by finding the table entry corresponding to state and input character class, then the next state is computed and assigned, and the actions pointer is checked and, if not NULL, executed to perform the actions corresponding to it. The table approach results, in general in more compact code, and allows you to share the same code to several cases very easily. If I had to implement this, I'd write an input mapping table, which maps all the bytes (256 entries) into character classes, which are the ones considered as input. I'd use a compact 3x4 table (three states, four input classes) and based on that, start testing the code. In case on errors you need some way to recover your automaton state, more if you are implementing this as some kernel driver. The state also allows you to distinguish what kind of packet you have received. IMHO, if you designed two different end of packet bytes instead of two different start of packet, you would not need to maintain double states and you will get a more compact automaton.... but that's a requirement of the problem and normally not a place in where you are free to change design.

Another thing is that, as you see, I have not included the control characters in the buffer, but you can do it if desired.... in that case, apart of emptying the buffer when you start gathering data, you have to add the first start of packet to the buffer (the one that corresponds, this is not necessary if you tag somehoe the packet type when you return it to the user) and you will need to add the packet end character to the buffer, once a full packet is recognized.

Missing Last Byte On Circular Buffer Burst

2 Answers2

Problem

Suggested solution