10

I'm trying to choose between pipes and unix sockets for an IPC mechanism.
Both can be monitored with select() and epoll(), which is great.

Now, pipes have a 4 kB (as of today) "atomic" write size, which is guaranteed by the Linux kernel.
Does such a feature exist in the case of Unix sockets? I couldn't find any document stating this explicitly.

Say I use a UNIX socket and I write x bytes of data from my client. Am I sure that these x bytes will be written on the server end of the socket when my server's select() returns?

On the same subject, would using SOCK_DGRAM ensure that writes are atomic (if such a guarantee is possible), since datagrams are supposed to be single well-defined messages?
What, then, would be the difference if I used SOCK_STREAM as the transfer mode?

Thanks in advance.

Gui13
  • I deleted my answer since I am simply not that familiar with the AF_UNIX socket family. When I answered, I was not thinking you meant "Unix Socket == AF_UNIX". The [man page](http://linux.die.net/man/7/unix) does say that the datagrams with Unix sockets are completely reliable. So that might be an option for you. But since I have not used them, I will leave it to someone more knowledgeable to answer. – Mark Wilkins Jan 12 '11 at 15:58

2 Answers

11

Pipes

Yes, the non-blocking capacity is usually 4 KB, but for maximum portability you'd probably be better off using the PIPE_BUF constant. An alternative is to use non-blocking I/O.

More information than you want to know in man 7 pipe.
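
If it helps picture it, here's a minimal C sketch (my own illustration, nothing from the man page; the message contents are arbitrary) of where PIPE_BUF comes from and what the atomicity claim means: any single write() of at most PIPE_BUF bytes lands in the pipe as one contiguous chunk, even with concurrent writers.

/* pipe_atomic.c - print PIPE_BUF and do one small (hence atomic) write */
#include <limits.h>     /* PIPE_BUF */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1) {
        perror("pipe");
        return 1;
    }

    printf("PIPE_BUF on this system: %d bytes\n", PIPE_BUF);

    char msg[64];
    snprintf(msg, sizeof msg, "hello from pid %ld\n", (long)getpid());

    /* A write of <= PIPE_BUF bytes is atomic: it will not be interleaved
     * with data from other writers on the same pipe. */
    if (write(fds[1], msg, strlen(msg)) == -1)
        perror("write");

    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof buf);
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fds[0]);
    close(fds[1]);
    return 0;
}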

Unix datagram sockets

Writes using the send family of functions on datagram sockets are indeed guaranteed to be atomic. In the case of Linux, they're reliable as well, and they preserve ordering (which makes the recent introduction of SOCK_SEQPACKET a bit confusing to me). There is much information about this in man 7 unix.

The maximum datagram size is socket-dependent. It's accessed using getsockopt/setsockopt on SO_SNDBUF. On Linux systems, it ranges between 2048 and wmem_max, with a default of wmem_default. For example on my system, wmem_default = wmem_max = 112640. (you can read them from /proc/sys/net/core) Most relevant documentation about this is in man 7 socket around the SO_SNDBUF option. I recommend you read it yourself, as the capacity doubling behavior it describes can be a bit confusing at first.
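
For illustration, a quick C sketch of the SO_SNDBUF query (mine; the 64 KB request is an arbitrary example) makes the default and the doubling behavior easy to see:

/* sndbuf.c - inspect and adjust SO_SNDBUF on an AF_UNIX datagram socket */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (s == -1) {
        perror("socket");
        return 1;
    }

    int sndbuf = 0;
    socklen_t len = sizeof sndbuf;
    if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) == 0)
        printf("default SO_SNDBUF: %d bytes\n", sndbuf);   /* wmem_default */

    /* Ask for 64 KB; on Linux the kernel stores double the requested value,
     * capped by wmem_max (see man 7 socket). */
    int wanted = 64 * 1024;
    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, &wanted, sizeof wanted) == 0) {
        getsockopt(s, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len);
        printf("SO_SNDBUF after asking for %d: %d bytes\n", wanted, sndbuf);
    }

    close(s);
    return 0;
}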

Practical differences between stream and datagram

Stream sockets only work connected. This mostly means they can only communicate with one peer at a time. As streams, they're not guaranteed to preserve "message boundaries".

Datagram sockets are disconnected. They can (theoretically) communicate with multiple peers at a time. They preserve message boundaries.

[I suppose the new SOCK_SEQPACKET is in between: connected and boundary-preserving.]

On Linux, both are reliable and preserve message ordering. If you use them to transmit stream data, they tend to perform similarly. So just use the one that matches your flow, and let the kernel handle buffering for you.
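
To make the boundary difference concrete, here's a small sketch of mine (socketpair() is used purely for brevity instead of a real client/server pair): it sends two messages and checks what the first recv() returns for each socket type.

/* boundaries.c - message boundaries: SOCK_DGRAM vs SOCK_STREAM */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static void demo(int type, const char *label)
{
    int sv[2];
    if (socketpair(AF_UNIX, type, 0, sv) == -1) {
        perror("socketpair");
        return;
    }

    send(sv[0], "first", 5, 0);
    send(sv[0], "second", 6, 0);

    char buf[64];
    ssize_t n = recv(sv[1], buf, sizeof buf, 0);
    printf("%-11s first recv() returned %zd bytes\n", label, n);
    /* SOCK_DGRAM:  5 - exactly one message per recv()
       SOCK_STREAM: typically 11 - both writes coalesced into the byte stream */

    close(sv[0]);
    close(sv[1]);
}

int main(void)
{
    demo(SOCK_DGRAM, "SOCK_DGRAM");
    demo(SOCK_STREAM, "SOCK_STREAM");
    return 0;
}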

Crude benchmark comparing stream, datagram, and pipes:

# unix stream 0:05.67
socat UNIX-LISTEN:u OPEN:/dev/null &
until [[ -S u ]]; do :;done
time socat OPEN:large-file UNIX-CONNECT:u

# unix datagram 0:05.12
socat UNIX-RECV:u OPEN:/dev/null &
until [[ -S u ]]; do :;done
time socat OPEN:large-file UNIX-SENDTO:u

# pipe 0:05.44
socat PIPE:p,rdonly=1 OPEN:/dev/null &
until [[ -p p ]]; do :;done
time socat OPEN:large-file PIPE:p

Nothing statistically significant here. My bottleneck is likely reading large-file.

JB.
  • Regarding non-blocking capacity, note that non-blocking and atomic (what the OP asked) are orthogonal things. Atomic write does not imply or cause it to be non-blocking, and a non-blocking write may be non-atomic, meaning that not all data has been written in one write() call but the call has not blocked the thread. – Maxim Egorushkin Jan 12 '11 at 20:01
  • I wish I could include some practical block size demonstration like I did in http://stackoverflow.com/questions/2552402/cat-file-vs-file/2552464#2552464 but pv doesn't work on sockets, and if I pipe through it, the bottleneck goes on the pipes. Oh well. Last paragraph stays unverified for now. – JB. Jan 12 '11 at 20:08
  • @Maxim you are right about the non-blocking/atomic distinction (though I feel we're referring to slightly different interpretations of atomicity). My memory of last diving in that part of Linux's sources was that they were more correlated than I initially thought. But as the OP didn't really tell what he was after, or what Unix flavor he used, I think I'll edit for clarity and leave it at that until he comments. – JB. Jan 12 '11 at 20:44
  • From looking at https://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html and https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html it looks like POSIX doesn't _technically_ guarantee that writes on DGRAM/SEQPACKET (unix) sockets be atomic, but any implementation that doesn't enforce atomicity for those two types is so fundamentally broken that I'm not even gonna try and do a workaround and simply panic if I detect that. – Petr Skocik Sep 17 '19 at 13:24
  • @PSkocik "A datagram must be sent in a single output operation, and must be received in a single input operation." from https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_10_06 That page does shed more light on the SEQPACKET behavior, though: non-atomic, but with safely recoverable message boundaries. Thanks for the opengroup links, they're a fine addition to this answer's context! – JB. Sep 20 '19 at 10:34

2

Say I use a UNIX socket and I write x bytes of data from my client. Am I sure that these x bytes will be written on the server end of the socket when my server's select() returns?

If you are using an AF_UNIX SOCK_STREAM socket, there is no such guarantee; that is, data written in one write()/send() call may require more than one read()/recv() call on the receiving side.
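
The usual way to cope with that (a minimal sketch of my own, error handling kept short) is to keep calling recv() until the number of bytes you expect has arrived:

/* recv_all: read exactly `want` bytes from a SOCK_STREAM socket,
 * looping because a single recv() may return only part of the data. */
#include <sys/socket.h>
#include <sys/types.h>

ssize_t recv_all(int fd, void *buf, size_t want)
{
    size_t got = 0;
    while (got < want) {
        ssize_t n = recv(fd, (char *)buf + got, want - got, 0);
        if (n == 0)
            break;      /* peer closed the connection */
        if (n < 0)
            return -1;  /* error: check errno (retry on EINTR if desired) */
        got += (size_t)n;
    }
    return (ssize_t)got;
}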

On the same subject, would using SOCK_DGRAM ensure that writes are atomic (if such a guarantee is possible), since datagrams are supposed to be single well-defined messages?

On the other hand, AF_UNIX SOCK_DGRAM sockets are required to preserve datagram boundaries and be reliable. You should get an EMSGSIZE error if send() cannot transmit the datagram atomically. I am not sure what happens for write(), as its man page does not say that it can report EMSGSIZE (although man pages sometimes do not list all errors returned). I would try overflowing the receiver's buffer with large datagrams to see which errors exactly send()/write() report.
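
If you want to run that experiment, a rough sketch could look like the following (the socketpair() setup and the 4x oversizing are just my arbitrary choices for illustration):

/* emsgsize.c - try to send a datagram bigger than the send buffer allows */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1) {
        perror("socketpair");
        return 1;
    }

    int sndbuf = 0;
    socklen_t len = sizeof sndbuf;
    getsockopt(sv[0], SOL_SOCKET, SO_SNDBUF, &sndbuf, &len);

    /* Build a datagram several times larger than the send buffer. */
    size_t big = (size_t)sndbuf * 4;
    char *payload = calloc(1, big);
    if (payload == NULL)
        return 1;

    if (send(sv[0], payload, big, 0) == -1)
        printf("send() of %zu bytes failed: %s\n", big, strerror(errno));
    else
        printf("send() of %zu bytes unexpectedly succeeded\n", big);

    free(payload);
    close(sv[0]);
    close(sv[1]);
    return 0;
}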

One advantage of using UNIX sockets over pipes is the bigger buffer size. I don't remember exactly what the limit of the pipe's kernel buffer is, but I remember not having enough of it and not being able to increase it (it is a hardcoded kernel constant). fast_producer_process | slow_consumer_process was orders of magnitude slower than fast_producer_process > file && slow_consumer_process < file due to insufficient pipe buffer size.

Maxim Egorushkin