
It can take up to 40 minutes before a packet is lost (packets arrive at a rate of about one every few minutes).

The MCU runs Linux kernel 3.18.48.

Using a scope on the UART's Rx pin, I can see that the packets (about 15 bytes long) arrive on the wire correctly.
But read() does not return with any of the packet's bytes (VMIN = 1, VTIME = 0, i.e. configured to return as soon as at least 1 byte is in the Rx buffer).

This code is used in 4 other projects, on different HW boards, and we have never seen this issue before.

Can you share ideas on how to tackle such an issue? How can I debug the UART driver, to better understand where the packet got lost?
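
One check that might narrow this down before instrumenting the kernel is to snapshot the driver's own receive and error counters around the time a packet goes missing. The sketch below uses the Linux `TIOCGICOUNT` ioctl with `struct serial_icounter_struct` from `<linux/serial.h>`. It assumes the board's UART driver implements that ioctl, which has not been verified for this platform. The function names `read_uart_counters` and `uart_error_delta` are hypothetical helpers, not part of the posted code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/serial.h>

/*
 * Fetch the driver's interrupt counters for an open serial port.
 * Returns 0 on success, -1 on failure (errno is set by ioctl()).
 * Assumption: the UART driver supports TIOCGICOUNT.
 */
int read_uart_counters(int fd, struct serial_icounter_struct *out)
{
    assert(out != NULL);
    return ioctl(fd, TIOCGICOUNT, out);
}

/*
 * Number of receive errors (overrun, framing, parity, buffer overrun)
 * recorded between two snapshots taken with read_uart_counters().
 */
int uart_error_delta(const struct serial_icounter_struct *before,
                     const struct serial_icounter_struct *after)
{
    return (after->overrun     - before->overrun)
         + (after->frame       - before->frame)
         + (after->parity      - before->parity)
         + (after->buf_overrun - before->buf_overrun);
}
```

If the `rx` field grows by about 15 while read() stays blocked, the bytes reached the driver and the loss is above it (line discipline, another reader, or the application). If `rx` does not move, or the error delta is non-zero, the problem is closer to the hardware or the driver.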

Thanks,

[Figure: logic analyzer capture of the lost packet]

E_UARTDRV_STATUS UartDrv_Open(void *pUart, S_UartDrv_InitData *init_data)
{
    int fd;
    struct termios tty;
    struct serial_struct serial;
 
    /*
     * O_RDWR - Opens the port for reading and writing
     * O_NOCTTY - The port never becomes the controlling terminal of the process.
     * O_NDELAY - Use non-blocking I/O.
     * On some systems this also means the RS232 DCD signal line is ignored.
     * Note well: if present, the O_EXCL flag is silently ignored by the kernel when opening a serial device like a modem.
     * On modern Linux systems programs like ModemManager will sometimes read and write to your device and possibly corrupt your program state. 
     * To avoid problems with programs like ModemManager you should set TIOCEXCL on the terminal after associating a terminal with the device. 
     * You cannot open with O_EXCL because it is silently ignored.
     */
    fd = open(init_data->PortName, O_RDWR | O_NOCTTY | O_NDELAY);
    if (fd == -1) { // if there is an invalid descriptor, print the reason
        SYS_LOG_ERR_V("fd invalid whilst trying to open com port %s: %s\n", init_data->PortName, strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
    if (tcflush(fd, TCIOFLUSH) < 0) {
        SYS_LOG_ERR_V("Error failed to flush input output buffers %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    // Enable low latency...this should affect the file /sys/bus/usb-serial/devices/ttyUSB0/latency_timer
    if (ioctl(fd, TIOCGSERIAL, &serial) < 0) {
        SYS_LOG_ERR_V("Error failed to get latency current value: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    serial.flags |= ASYNC_LOW_LATENCY;
 
    if (ioctl(fd, TIOCSSERIAL, &serial) < 0) {
        SYS_LOG_ERR_V("Error failed to set Low latency: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    if (fcntl(fd, F_SETFL, 0) < 0) {
        SYS_LOG_ERR_V("Error failed to set file flags: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    /* Get current configuration */
    if (tcgetattr(fd, &tty) < 0) {
        SYS_LOG_ERR_V("Error failed to get current configuration: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    if (cfsetospeed(&tty, init_data->baud) < 0) {
        SYS_LOG_ERR_V("Error failed to set output baud rate: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    if (cfsetispeed(&tty, init_data->baud) < 0) {
        SYS_LOG_ERR_V("Error failed to set input baud rate: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
 
    tty.c_cflag |= (CLOCAL | CREAD); /* Enable the receiver and set local mode */
    tty.c_cflag &= ~CSIZE;
    tty.c_cflag |= CS8;         /* 8-bit characters */
    tty.c_cflag &= ~PARENB;     /* no parity bit */
    tty.c_cflag &= ~CSTOPB;     /* only need 1 stop bit */
 
    /*
     * Input flags - Turn off input processing
     * convert break to null byte, no CR to NL translation,
     * no NL to CR translation, don't mark parity errors or breaks
     * no input parity check, don't strip high bit off,
     * no XON/XOFF software flow control
     * BRKINT - If this bit is set and IGNBRK is not set, a break condition clears the terminal input and output queues and raises a SIGINT signal for the foreground process group associated with the terminal.
     * If neither BRKINT nor IGNBRK are set, a break condition is passed to the application as a single '\0' character if PARMRK is not set, or otherwise as a three-character sequence '\377', '\0', '\0'.
     * INPCK - If this bit is set, input parity checking is enabled. If it is not set, no checking at all is done for parity errors on input; the characters are simply passed through to the application.
     * Parity checking on input processing is independent of whether parity detection and generation on the underlying terminal hardware is enabled; see Control Modes. 
     * For example, you could clear the INPCK input mode flag and set the PARENB control mode flag to ignore parity errors on input, but still generate parity on output.
     * If this bit is set, what happens when a parity error is detected depends on whether the IGNPAR or PARMRK bits are set. If neither of these bits are set, a byte with a parity error is passed to the application as a '\0' character.
     */
    tty.c_iflag &= ~(BRKINT | PARMRK | ISTRIP | INLCR | IGNCR | ICRNL | IXON);
 
    /*
     * IGNBRK - If this bit is set, break conditions are ignored.
     * A break condition is defined in the context of asynchronous serial data transmission as a series of zero-value bits longer than a single byte.
     */
    tty.c_iflag |= IGNBRK;
 
    /*
     * No line processing
     * echo off, echo newline off, canonical mode off, 
     * extended input processing off, signal chars off
     */
    tty.c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
 
    /* 
     * Output flags - Turn off output processing
     * no CR to NL translation, no NL to CR-NL translation,
     * no NL to CR translation, no column 0 CR suppression,
     * no Ctrl-D suppression, no fill characters, no case mapping,
     * no local output processing
     *
     * c_oflag &= ~(OCRNL | ONLCR | ONLRET | ONOCR | ONOEOT| OFILL | OLCUC | OPOST);
     */
    tty.c_oflag = 0;
 
    /* fetch bytes as they become available */
    tty.c_cc[VMIN] = 0;
    tty.c_cc[VTIME] = 1; // timeout in tenths of a second
 
    if (tcsetattr(fd, TCSANOW, &tty) != 0) {
        SYS_LOG_ERR_V("Error failed to set new configuration: %s\n", strerror(errno));
        return UARTDRV_STATUS_ERROR;
    }
    uartPeripheral.fd = fd;
    return UARTDRV_STATUS_SUCCESS;
}

uint8_t *UartDrv_Rx(S_UartDrv_Handle *handle, uint16_t bytesToRead, uint16_t *numBytesRead)
{
    ssize_t n_read;
    uint16_t n_TotalReadBytes = 0;
    struct timespec timestamp;
    struct timespec now;
    long diff_ms;
    bool Timeout = false, isPartialRead = false;
    
    if (handle == NULL) {
        SYS_LOG_ERR("UartDrv Error: async rx error - peripheral error");
        exit(EXIT_FAILURE);
    }

    if (bytesToRead > sizeof(uartPeripheral.buffer_rx)) {
        ESILOG_ERR_V("Param Error: Invalid length %u, max length %zu", bytesToRead, sizeof(uartPeripheral.buffer_rx));
        *numBytesRead = 0;
        return NULL;
    }

    while(n_TotalReadBytes < bytesToRead && !Timeout) {
        do {
            n_read  = read(uartPeripheral.fd, &uartPeripheral.buffer_rx[n_TotalReadBytes], bytesToRead - n_TotalReadBytes);
            if (isPartialRead) {
                clock_gettime(CLOCK_REALTIME, &now);
                diff_ms = (now.tv_sec - timestamp.tv_sec)*1000;
                diff_ms += (now.tv_nsec - timestamp.tv_nsec)/1000000;
                if (diff_ms > UART_READ_TIMEOUT_MS) {
                    SYS_LOG_ERR("UartDrv_Rx: Error, timeout while reading\r\n");
                    Timeout = true;
                }
            }
        } while ((n_read != -1) && (n_read == 0) && !Timeout);

        if (n_read == -1) {
            if (errno == EINTR) {
                ESILOG_WARN("Uart Interrupted");
                continue;
            }
            ESILOG_ERR_V("Uart Error: [%d, %s]", errno, strerror(errno));
            exit(EXIT_FAILURE);
        }

        n_TotalReadBytes += (uint16_t)n_read;
        if (n_TotalReadBytes < bytesToRead) {
            //SYS_LOG_DBG_V("UartDrv_Rx: couldn't fetch all bytes, read %hu, expected %hu, continue reading %s\r\n", n_TotalReadBytes, bytesToRead, isPartialRead? "During Partial read": "");
            if (!isPartialRead) {
                isPartialRead = true;
                clock_gettime(CLOCK_REALTIME, &timestamp);
            }
        }
    }
    *numBytesRead = n_TotalReadBytes;
    return uartPeripheral.buffer_rx;
}
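
A side note on the timeout bookkeeping in UartDrv_Rx(): it measures elapsed time with CLOCK_REALTIME, which can jump when the system time is set or adjusted, so the partial-read timeout may fire early or late. The sketch below shows one possible alternative, under the assumption that a monotonic deadline combined with poll() is acceptable here. `elapsed_ms` and `read_until_deadline` are hypothetical names, and this is not claimed to be the cause of the lost packet.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds from *start to *end; both must come from the same clock. */
long elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000L
         + (end->tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Read up to len bytes from fd, giving up timeout_ms after the call began.
 * Returns the number of bytes read (fewer than len on timeout or
 * end-of-file), or -1 on error.
 */
ssize_t read_until_deadline(int fd, uint8_t *buf, size_t len, long timeout_ms)
{
    struct timespec start, now;
    size_t total = 0;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    while (total < len) {
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;
        long remaining = timeout_ms - elapsed_ms(&start, &now);
        if (remaining <= 0)
            break;                      /* deadline passed */

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, (int)remaining);
        if (ready < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, re-check deadline */
            return -1;
        }
        if (ready == 0)
            break;                      /* nothing arrived in time */

        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (n == 0)
            break;                      /* end-of-file or VTIME expiry */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

Because the wait happens inside poll(), the loop no longer spins on read() returning 0, and a short return value tells the caller exactly how many bytes of the packet had arrived when the deadline expired.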
0andriy
G. Roby
  • Your program (which needs to be posted if you expect help on this site) is several layers removed from the UART. See [Linux serial drivers](http://www.linux.it/~rubini/docs/serial/serial.html). Considering the typical user mistakes made when coding for POSIX serial terminals, the problem(s) is/are probably in your code (rather than the OS). IOW there's no need to *"debug the UART driver"* (assuming that the driver is in mainline); instead debug your code first. – sawdust Mar 13 '22 at 02:48
  • Thanks @sawdust, I added my code, of course we can't tell where the bug is. I suspect that some other process reads the packet, in between. Or some other crazy shit happening. For that I need the ability to debug it all the way, from driver level up to the read(), currently I get stuck on this read() – G. Roby Mar 13 '22 at 19:31
  • Your termios init is decent, but not perfect. CRTSCTS is left to chance. `c_oflag = 0` should be replaced with `c_oflag &= ~OPOST`. You claimed you use *"VMIN = 1, VTIME = 0"*, but the code actually has `VMIN = 0` and `VTIME = 1`. Such a *timed* read combined with ASYNC_LOW_LATENCY is contradictory. You've only posted two procedures. How is **UartDrv_Rx()** used? What is **UART_READ_TIMEOUT_MS**? *"a packet is lost"* -- That's a conclusion, and not a description of an observable symptom. How did you determine that a packet was not received? – sawdust Mar 13 '22 at 21:06
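
The termios points raised in the last comment could be sketched roughly as follows. This is a minimal illustration, not the poster's code: `apply_raw_blocking_mode` is a hypothetical helper that edits a `struct termios` already filled in by tcgetattr(), and turning hardware flow control off is an assumption about the board's wiring (if RTS/CTS is actually connected and used, that line would need to set the flag instead).

```c
#define _DEFAULT_SOURCE 1   /* exposes CRTSCTS on glibc */
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/*
 * Adjust an already-populated termios structure for raw 8N1 input with
 * a blocking read that returns as soon as at least one byte is queued.
 * The caller is expected to pass the result to tcsetattr().
 */
void apply_raw_blocking_mode(struct termios *tty)
{
    assert(tty != NULL);

    tty->c_cflag |= (CLOCAL | CREAD);           /* local mode, receiver on */
    tty->c_cflag &= ~(CSIZE | PARENB | CSTOPB); /* no parity, 1 stop bit */
    tty->c_cflag |= CS8;                        /* 8 data bits */
#ifdef CRTSCTS
    tty->c_cflag &= ~CRTSCTS;   /* assumption: no RTS/CTS flow control */
#endif

    tty->c_iflag &= ~(BRKINT | PARMRK | ISTRIP | INLCR | IGNCR | ICRNL | IXON);
    tty->c_iflag |= IGNBRK;

    tty->c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);

    /* Clear only OPOST instead of zeroing every output flag. */
    tty->c_oflag &= ~OPOST;

    /* Block until at least one byte is available, with no timer. */
    tty->c_cc[VMIN]  = 1;
    tty->c_cc[VTIME] = 0;
}
```

With `VMIN = 1` and `VTIME = 0`, read() blocks until at least one byte is available, which is the behaviour described in the question text, unlike the posted `VMIN = 0`, `VTIME = 1` setting, where read() can return 0 after a tenth of a second.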

0 Answers