29

What are "message boundaries" in the following context?

One difference between TCP and UDP is that UDP preserves message boundaries.

I understand the difference between TCP and UDP, but I'm unsure about the definition of "message boundaries".

Since UDP includes the destination and port information in each individual packet, could this be what gives a message its "boundary"?

Matthias Braun
KMC
  • 1
    TCP is a reliable ordered byte-stream. UDP is an unreliable unordered stream of packets. Strictly speaking, the question here is wrong. You confuse application-level messages with protocol-level messages. UDP most certainly does *NOT* preserve application-level message boundaries if the message is larger than the UDP message allowed by the transport (possibly as small as ~1500 bytes). – Preston L. Bannister Jul 03 '22 at 19:17
  • To understand how application protocols (like HTTP) implement boundaries when working on top of TCP (which is byte stream, has no boundaries), see this answer - [link](https://superuser.com/questions/1430814/how-does-tcp-handle-multiple-requests-targeted-to-one-port) – Vijay Chavda Feb 17 '23 at 08:28

3 Answers

49

No, message boundaries have nothing to do with destinations or ports. A "message boundary" is the separation between two messages being sent over a protocol. UDP preserves message boundaries. If you send "FOO" and then "BAR" over UDP, the other end will receive two datagrams, one containing "FOO" and the other containing "BAR".

If you send "FOO" and then "BAR" over TCP, no message boundary is preserved. The other end might get "FOO" and then "BAR". Or it might get "FOOBAR". Or it might get "F" and then "OOB" and then "AR". TCP does not make any attempt to preserve application message boundaries -- it's just a stream of bytes in each direction.
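The difference can be observed with a small loopback experiment. The following is only a sketch: the function names `udp_demo` and `tcp_demo` are made up for illustration, and it assumes a machine where loopback sockets are available. On loopback the TCP side will usually show "FOOBAR" in one or two chunks, but how the bytes are split is not something the code (or TCP) promises.

```python
import socket


def udp_demo():
    """Send b"FOO" and b"BAR" as two UDP datagrams and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP is unreliable; do not wait forever
        for payload in (b"FOO", b"BAR"):
            sender.sendto(payload, receiver.getsockname())
        # Each recvfrom() call returns exactly one datagram, never a merged
        # or split one.
        return [receiver.recvfrom(1024)[0] for _ in range(2)]


def tcp_demo():
    """Send b"FOO" then b"BAR" over TCP and return the chunks recv() produced."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as client:
            server, _ = listener.accept()
            with server:
                server.settimeout(2.0)
                client.sendall(b"FOO")
                client.sendall(b"BAR")
                chunks, received = [], 0
                # Keep reading until all 6 bytes have arrived; the number and
                # size of the chunks is up to the TCP stack.
                while received < 6:
                    chunk = server.recv(1024)
                    if not chunk:
                        break
                    chunks.append(chunk)
                    received += len(chunk)
                return chunks


if __name__ == "__main__":
    print("UDP datagrams:", udp_demo())
    print("TCP chunks:   ", tcp_demo())
```

The UDP list always contains whole datagrams, while the TCP list is only guaranteed to concatenate to the bytes that were sent, in order.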

David Schwartz
  • In TCP, is the stream/buffer transmitted and received in order? So I wouldn't get "BAR" before "FOO", or have the bytes mixed up into something like "ORAFBO"? – KMC Mar 05 '12 at 08:41
  • 2
    @KMC: [Wikipedia TCP](http://en.wikipedia.org/wiki/Transmission_Control_Protocol): TCP provides reliable, **ordered** delivery of a stream of bytes [..] – Jonas Bötel Mar 05 '12 at 08:44
  • 2
    @KMC It may or may not get transmitted in order on the wire (in practice it will be, but no law requires it to be), however, it will be presented in order to the receiving application. – David Schwartz Mar 05 '12 at 08:51
  • On the receiver side, how does it ensure that it has completely read the content of a request message, potentially across multiple calls to recv, without mixing in data from the next request? I think for HTTP the "Content-Length" header can be used as a hint about how many bytes are left once all headers are read. Do other application protocols built on top of TCP, like DNS or SMTP, use a similar idea? – torez233 Jun 18 '21 at 04:33
  • @hafan96 That's the job of the protocol implementation in the sender and receiver. They can use length-preceded messages, they can use boundary characters (like each message on its own line), or other mechanisms. – David Schwartz Jun 21 '21 at 17:15
  • This answer is *entirely* wrong. If you send "FOO" then "BAR" over a TCP connection, then you *always* get "FOOBAR" on the other end. The order is always preserved. Perhaps you are confused by buffering? If the message(s) are large enough, you might need to recv() or read() more than once to get the entire message. But order is always preserved - without exception. – Preston L. Bannister Dec 14 '22 at 15:52
  • @PrestonL.Bannister This is precisely what the second paragraph of this answer says. – David Schwartz Dec 14 '22 at 18:57
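The "boundary characters" approach mentioned in the comments (each message on its own line) might look roughly like the sketch below. `LineFramer` is a hypothetical helper, not part of any library, and it assumes that a newline never appears inside a message.

```python
from typing import List


class LineFramer:
    """Reassembles newline-terminated messages from arbitrary stream chunks."""

    def __init__(self) -> None:
        self._buffer = bytearray()  # bytes received but not yet terminated

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add one recv() result and return every message completed by it."""
        self._buffer.extend(chunk)
        # Everything before the last newline is a complete message;
        # whatever follows it is the start of the next one.
        *complete, rest = bytes(self._buffer).split(b"\n")
        self._buffer = bytearray(rest)
        return complete


if __name__ == "__main__":
    framer = LineFramer()
    # Simulate TCP delivering "FOO\nBAR\n" split at arbitrary points.
    for chunk in (b"F", b"OO\nB", b"AR\n"):
        print(chunk, "->", framer.feed(chunk))
```

The receiver passes whatever each `recv()` call returns to `feed()` and acts only on the complete messages it gets back, so it does not matter how TCP happened to split the stream.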
3

A message boundary in this context is simply the start and end of a message/packet. With TCP connections, all messages/packets are combined into one continuous stream of data, whereas with UDP the messages are given to you in their original form: each datagram you receive has exactly the size, in bytes, that it was sent with.

aidanok
0

You should be careful not to confuse application-level and protocol-level message/packet boundaries. They are very different things, and the question does not clearly distinguish between them.

As a cheat, on an isolated subnet, with small messages, and when reliable, ordered delivery is not required, you can use UDP. A single UDP send/receive will always contain exactly one message, which is simpler to code.

When reliable and ordered delivery is required, or larger application-level messages are needed, you want to use TCP. Yes, a small bit of discipline is required of your application: you need to properly serialize/deserialize your application-level messages through the TCP stream. This is easily done.
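One common way to do that serialization is to prefix every message with its length. The sketch below is one possible scheme, not the only one: the 4-byte big-endian header and the names `encode_message` and `decode_messages` are assumptions made for this example.

```python
import struct
from typing import List

# Assumed wire format: 4-byte unsigned length in network byte order,
# followed by that many payload bytes.
HEADER = struct.Struct("!I")


def encode_message(payload: bytes) -> bytes:
    """Return the payload preceded by its length header."""
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer: bytearray) -> List[bytes]:
    """Remove and return every complete message at the front of buffer.

    Incomplete trailing data is left in the buffer for the next call.
    """
    messages = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, 0)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # the payload has not fully arrived yet
        messages.append(bytes(buffer[HEADER.size:end]))
        del buffer[:end]
    return messages


if __name__ == "__main__":
    stream = encode_message(b"FOO") + encode_message(b"BAR")
    buffer = bytearray()
    # Simulate recv() returning the stream in arbitrary pieces.
    for piece in (stream[:5], stream[5:9], stream[9:]):
        buffer.extend(piece)
        print(decode_messages(buffer))
```

The sender writes `encode_message(payload)` to the socket; the receiver appends each `recv()` result to a buffer and calls `decode_messages` on it, which restores the application-level boundaries regardless of how the bytes were chunked in transit.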

The "answer" is that application-level message boundaries must be assured by the application. UDP can serve for some applications. TCP is better for others (and a much larger set).

Also, if you have multiple threads doing unmanaged writes to a single stream (network or file) then that is a problem in your application.