I'm writing a backend for a web application in C++ (with Boost.Beast), and the front-end will probably use socket.io. So this question is both about the implementation and about whether something in the WebSocket standard answers it.

I'm not sure what precautions to take to guarantee the completeness of a message. Say the client sends a message that is 100 bytes long, and boost::beast reads the message with async_read to a multi_buffer. Am I guaranteed to receive the whole 100 bytes? Probably. But what if the message is 1 MB?

Why do I think that this question is important? Because it determines how simple my communication protocol is going to be. If only complete messages are sent and received, then I don't have to implement a middleware protocol with a header that gives the size of the message (which is necessary with TCP in general, but not with some messaging libraries like ZeroMQ). However, if there's no guarantee that messages are complete on arrival, then I should implement a protocol to get the message size. Something like (simplest possible): 6 bytes that contain the message size, followed by the message. Then I treat the incoming bytes as a FIFO queue: first read the size, then read that many bytes of message.
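To make the scheme above concrete, here is a minimal sketch of such a length-prefixed decoder. It assumes the 6 size bytes are an unsigned big-endian integer, which the description does not specify, and the names `LengthPrefixedDecoder` and `encode_message` are hypothetical, not part of Beast or Asio.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: the size header is 6 bytes, unsigned, big-endian.
constexpr std::size_t kHeaderSize = 6;

// Hypothetical helper: prepend the 6-byte size header to a payload.
std::string encode_message(const std::string& payload) {
    std::string out(kHeaderSize, '\0');
    std::uint64_t len = payload.size();
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
        out[kHeaderSize - 1 - i] = static_cast<char>((len >> (8 * i)) & 0xFF);
    }
    return out + payload;
}

// Hypothetical helper: buffers raw stream bytes (FIFO) and hands out
// complete messages only.
class LengthPrefixedDecoder {
public:
    // Append whatever the transport delivered, possibly a partial message.
    void feed(const std::string& bytes) { buf_ += bytes; }

    // Returns true and fills `out` if a complete message is buffered;
    // returns false if more bytes are needed.
    bool next(std::string& out) {
        if (buf_.size() < kHeaderSize) return false;
        std::uint64_t len = 0;
        for (std::size_t i = 0; i < kHeaderSize; ++i) {
            len = (len << 8) | static_cast<unsigned char>(buf_[i]);
        }
        if (buf_.size() - kHeaderSize < len) return false;
        out = buf_.substr(kHeaderSize, static_cast<std::size_t>(len));
        buf_.erase(0, kHeaderSize + static_cast<std::size_t>(len));
        return true;
    }

private:
    std::string buf_;
};
```

A caller would `feed()` every chunk read from the socket and then call `next()` in a loop until it returns false.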

Am I approaching websocket the wrong way? Please advise.

The Quantum Physicist

1 Answer

Yes, the question is important.

Luckily, the answer is elementary: WebSocket is not a stream-based protocol like TCP; it is message-based.

The RFC (RFC 6455, section 5.2) includes the following diagram of the frame format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               | Masking-key, if MASK set to 1 |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+

So framing is part of the WebSocket protocol: every frame carries its own payload length, so you don't need to add a size header of your own. This looks like an excellent backgrounder if you want to understand the details: http://lucumr.pocoo.org/2012/9/24/websockets-101/
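To illustrate how the header in the diagram is laid out on the wire, here is a minimal sketch that decodes it from raw bytes. `FrameHeader` and `parse_frame_header` are hypothetical names for illustration, and the sketch ignores the reserved bits, extensions, and validation that a real implementation has to handle.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decoded fields of one WebSocket frame header (RFC 6455, section 5.2).
struct FrameHeader {
    bool fin = false;              // set on the final frame of a message
    std::uint8_t opcode = 0;       // 0x0 continuation, 0x1 text, 0x2 binary, ...
    bool masked = false;           // set on client-to-server frames
    std::uint64_t payload_len = 0; // payload size in bytes
    std::uint8_t mask_key[4] = {0, 0, 0, 0};
    std::size_t header_size = 0;   // bytes occupied by the header itself
};

// Hypothetical helper: returns true and fills `out` if `data` starts with a
// complete frame header; returns false if more bytes are needed.
bool parse_frame_header(const std::vector<std::uint8_t>& data, FrameHeader& out) {
    if (data.size() < 2) return false;

    FrameHeader h;
    h.fin = (data[0] & 0x80) != 0;
    h.opcode = static_cast<std::uint8_t>(data[0] & 0x0F);
    h.masked = (data[1] & 0x80) != 0;
    const std::uint8_t len7 = static_cast<std::uint8_t>(data[1] & 0x7F);

    // 126 means a 16-bit extended length follows, 127 a 64-bit one.
    const std::size_t ext = (len7 == 126) ? 2 : (len7 == 127) ? 8 : 0;
    const std::size_t mask_bytes = h.masked ? 4 : 0;
    if (data.size() < 2 + ext + mask_bytes) return false;

    std::size_t pos = 2;
    if (ext == 0) {
        h.payload_len = len7;
    } else {
        // Extended lengths are sent in network byte order (big-endian).
        for (std::size_t i = 0; i < ext; ++i) {
            h.payload_len = (h.payload_len << 8) | data[pos + i];
        }
        pos += ext;
    }
    if (h.masked) {
        for (std::size_t i = 0; i < 4; ++i) h.mask_key[i] = data[pos + i];
        pos += 4;
    }
    h.header_size = pos;
    out = h;
    return true;
}
```

Note that a single message may be split across several frames; only the last one has FIN set, which is why reassembly is best left to the library.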

However, in practice you'd just use a higher-level WebSocket library and let it handle the framing.

sehe
  • Thanks for the answer and the link. I have a follow-up question on the matter. Does this mean that I should strictly use `async_read` + a `multi_buffer`, and not any variation that may read parts of the message (like `async_read_some`)? In other words, boost::asio is flexible to allow the dev to read parts of the message. How can one check if the message is complete before emptying the buffer? – The Quantum Physicist May 29 '18 at 19:23
  • I'd suggest that indeed. I don't know what the goal would be to interfere with a web-socket implementation (such as [Boost Beast's](https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/examples.html)) – sehe May 29 '18 at 19:54
  • Actually beast has [`async_read_some`](https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/ref/boost__beast__http__async_read_some.html) (not just through asio)... – The Quantum Physicist May 29 '18 at 20:13
  • @TheQuantumPhysicist The linked function works in coordination with a Parser instance, and as such may be considered part of the interface documentation for the Parser concept. Regardless, there are indeed `read_some`/`write_some` interfaces for websockets as well, which you can coordinate for "completeness" (passing a parameter or using `is_message_done()`). This can come in handy for very large messages or streaming operations. Keep in mind that regardless of your choice, WebSockets will still add framing surrounding your message. – sehe May 29 '18 at 22:19
  • See also [Frames](https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/using_websocket/send_and_receive_messages.html#beast.using_websocket.send_and_receive_messages.frames), which gives a rationale for why you'd need those functions (although the example does still read the entire message before echoing) – sehe May 29 '18 at 22:19
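
The `read_some` / `is_message_done()` coordination mentioned in the comments can be sketched roughly as below. The loop is written as a template so it compiles without Boost; with Beast one would pass a `websocket::stream` and a dynamic buffer such as `flat_buffer`. `read_whole_message` is a hypothetical name, and `FakeStream` is only a stand-in for exercising the loop without a network.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: read one message in pieces of at most `chunk_limit`
// bytes, stopping when the stream reports the message is complete.
// Returns the number of read_some calls made.
template <class WsStream, class DynamicBuffer>
std::size_t read_whole_message(WsStream& ws, DynamicBuffer& buffer,
                               std::size_t chunk_limit) {
    std::size_t calls = 0;
    do {
        // With Beast, this overload throws on error.
        ws.read_some(buffer, chunk_limit);
        ++calls;
        // A streaming consumer could process the new bytes here.
    } while (!ws.is_message_done());
    return calls;
}

// Stand-in stream (not a Beast type): delivers one pending message in pieces.
struct FakeStream {
    std::string pending; // bytes of the message not yet delivered

    std::size_t read_some(std::string& buffer, std::size_t limit) {
        const std::size_t n = std::min(limit, pending.size());
        buffer.append(pending, 0, n);
        pending.erase(0, n);
        return n;
    }

    bool is_message_done() const { return pending.empty(); }
};
```

If you don't need to process partial data, a plain `read` or `async_read` on the websocket stream already returns only once a complete message has arrived.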