I'm developing software for an embedded device (NXP i.MX 8M mini SoC). It is running Linux (currently 5.15.32 kernel) and I'm providing serial I/O over the SoC's built-in USB interface using a multifunction gadget.

The gadget provides an RNDIS network interface and an ACM serial interface. Both work great, but I found that when my app (which periodically writes data to the ACM device, /dev/ttyGS0) terminates, it sometimes takes 16 seconds to close the ACM device's file descriptor.

I have observed that this delay only happens when there is no application reading bytes from the other end of the ACM interface. If there is anything reading those bytes (even a simple cat /dev/ttyACM0 > /dev/null on the connected host), then the close() call completes immediately.
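
To put a number on the delay in both cases (host reading vs. not reading), the close() call can be wrapped in a small timing helper. This is a minimal sketch: `timed_close` is a hypothetical name, and it only measures wall-clock time around close(), nothing specific to the gadget driver.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() under strict -std modes */
#endif
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Close fd and report how long the close() call itself took.
 * Returns whatever close() returned; the duration in seconds
 * is stored in *elapsed. (Hypothetical helper for illustration.) */
int timed_close(int fd, double *elapsed)
{
    struct timespec start, end;

    assert(elapsed != NULL);

    clock_gettime(CLOCK_MONOTONIC, &start);
    int rc = close(fd);
    clock_gettime(CLOCK_MONOTONIC, &end);

    *elapsed = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return rc;
}
```

Logging the value of `elapsed` on every shutdown makes it easy to correlate the slow closes with whether anything was reading on the host side.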

So I'm pretty sure that the delay is because of the data queued in the USB driver. The close() call (I assume) is waiting for the data to be received by the remote end of the connection, and it times out after 16 seconds.

I have already tried to tcflush(fd, TCIOFLUSH) the file descriptor before the close() operation, but it didn't help.

I also tried setting the bit-rate to B0, which is meant to represent a "hangup" event for a modem, thinking that might purge the data, but it didn't change anything.

Here's the (hopefully relevant) part of my USB gadget setup script:

modprobe libcomposite
cd /sys/kernel/config/usb_gadget
mkdir g
cd g

echo 0x1d6b > idVendor    # Linux Foundation
echo 0x0104 > idProduct   # Multifunction composite gadget
echo 0x0100 > bcdDevice   # v1.0.0
echo 0x0200 > bcdUSB      # USB 2.0

# Miscellaneous device class / Interface Association Descriptor (IAD)
echo 0xEF > bDeviceClass
echo 0x02 > bDeviceSubClass
echo 0x01 > bDeviceProtocol

mkdir -p strings/0x409
echo "my_serial_no"    > strings/0x409/serialnumber
echo "my_manufacturer" > strings/0x409/manufacturer

mkdir -p functions/acm.usb0
mkdir -p configs/c.1
echo 250 > configs/c.1/MaxPower
ln -s functions/acm.usb0  configs/c.1

(creation and configuration of RNDIS function omitted)

ln -s configs/c.1 os_desc

Here's a fragment showing how I open and configure the file descriptor (skipping over the error handling checks):

fd = open("/dev/ttyGS0", O_RDWR | O_NONBLOCK);

struct termios settings;

tcgetattr(fd, &settings);

cfsetospeed(&settings, B1000000);  // 1Mbps

settings.c_cflag &= ~(CSTOPB |  // 1 stop bit
                      CSIZE  |  // Clear previous bit-size bits
                      PARENB);  // Don't use parity checking
settings.c_cflag |= (CS8 |      // 8 data bits
                     CLOCAL);   // Ignore modem control lines
settings.c_iflag &= ~(IGNBRK |  // Don't ignore break
                      BRKINT |  // Don't generate SIGINT on BRK
                      PARMRK |  // Don't mark framing/parity errors
                      ISTRIP |  // Don't strip the 8th bit
                      INLCR  |  // Don't translate NL to CR
                      IGNCR  |  // Don't ignore CR
                      ICRNL  |  // Don't translate CR to NL
                      IXON);    // Don't use XON/XOFF flow control
settings.c_oflag &= ~OPOST;     // Don't use implementation-defined output processing
settings.c_lflag &= ~(ECHO   |  // Don't echo input characters
                      ECHONL |  // Don't echo NL
                      ICANON |  // Don't use canonical mode
                      ISIG   |  // Don't generate signals
                      IEXTEN);  // Don't use implementation-defined input processing

settings.c_cc[VTIME] = 0;  // Disable timeout for non canonical read
settings.c_cc[VMIN]  = 0;  // No minimum chars for non canonical read

tcsetattr(fd, TCSANOW, &settings);
tcflush(fd, TCOFLUSH);

I have tried explicitly enabling and disabling flow control (c_cflag |= CRTSCTS). It changes nothing.

So my questions are:

  • Can my problem be solved by changing the ACM device's configuration or the termios configuration?
  • If not, is there a way to tell Linux that I don't want to wait when closing the FD?
  • If that's not an option, is there a way to tell the ACM driver to flush/discard these buffers so the close operation doesn't have to wait?
  • If none of the above is possible, is there a way my device can know when there is an app connected to the other end of the ACM device so I can tell my app to not send data when there is no receiver?
David C.
  • "*is there a way my device can know when there is an app connected to the other end ...*" -- That sounds a lot like modem and/or HW flow control lines. While reading some [Python doc](https://pyserial.readthedocs.io/en/latest/pyserial_api.html#serial.Serial.open), it mentioned that "Some OS and/or drivers may activate RTS and or DTR automatically, as soon as the port is opened." Maybe you need to test if that occurs with your remote app. If so, then your gadget can enable CRTSCTS in its termios config. – sawdust Aug 17 '23 at 22:07
  • "*So I assume the data we're blocking on is not in the TTY driver, ...*" -- Rather reckless to send everyone on a chase based on your (untested/unproven) assumption. Years ago, back in the Linux 2.6 era, I recall having a problem with a serial port taking ~20 seconds to close. The cause IIRC was a bogus modem control line. BTW please show your code that opens and configures termios for your /dev/ttyGS0. – sawdust Aug 17 '23 at 22:33
  • I've added that code fragment. This is a virtual USB interface. Flow control seems to be handled automatically by the CDC ACM layer, because we never see dropped characters even at very high bit-rates (e.g. 1Mbps) whether or not we configure it at the termios level. – David C. Aug 18 '23 at 13:58
  • I should add that we use the identical open/config sequence for a physical UART (except that it runs at 115200 bps), and there is no problem closing that interface at any time. I assume because a physical UART without flow control will just write the bits whether or not anything is being read from the other end of the link. – David C. Aug 18 '23 at 15:12
  • It might be the USB controller (driver) being unable to do something quickly enough, or the USB specification requiring something, or lock contention (a bug in the code?)... You need to debug it yourself first to narrow the scope. – 0andriy Aug 18 '23 at 19:37
  • Some things to try: (1) Open **/dev/ttyGS0** with O_NOCTTY. (2) Enable CREAD in termios settings. (3) Rebuild kernel with CONFIG_USB_GADGET_DEBUG and maybe even CONFIG_USB_GADGET_VERBOSE. Routine **gs_close()** puts out two messages around a possible 15 sec delay. But if the host has "disconnected" the link, there should be no delay. If you do see the 15 sec delay between those messages then you need to hack that code to report the disconnection status. – sawdust Aug 18 '23 at 21:35
  • `O_NOCTTY` didn't change anything. `CREAD` was already set by the system. I'll have to make some time to change the kernel config to debug the USB gadget driver. – David C. Aug 21 '23 at 14:06
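
The two user-space suggestions from the comments (open with O_NOCTTY, make sure CREAD is set) can be sketched as follows. `open_noctty_cread` is a hypothetical helper name; as reported in the last comment, neither change affected the close() delay.

```c
#include <assert.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open a serial device without making it the controlling terminal
 * and make sure the receiver is enabled. Returns the descriptor,
 * or -1 on failure (including when path is not a terminal). */
int open_noctty_cread(const char *path)
{
    struct termios t;

    assert(path != NULL);

    int fd = open(path, O_RDWR | O_NONBLOCK | O_NOCTTY);
    if (fd < 0)
        return -1;

    if (tcgetattr(fd, &t) != 0) {
        close(fd);
        return -1;
    }
    t.c_cflag |= CREAD;                  /* enable the receiver */
    if (tcsetattr(fd, TCSANOW, &t) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```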

0 Answers