
I have a problem with a boost::asio::ip::tcp::iostream. I am trying to send about 20 raw bytes, but the 20-byte payload is split into two TCP packets: 1 byte, then 19 bytes. It is a simple problem, yet I have no idea why it is happening. I am writing this for a legacy binary protocol that very much requires the payload to fit in a single TCP packet (groan).

Pasting the whole source from my program would be long and overly complex, so I've reduced the issue to the two functions below (tested, and it does reproduce the issue):

#include <iostream>

// BEGIN cygwin nastyness
// The following macros and conditions are to address a Boost compile
// issue on cygwin. https://svn.boost.org/trac/boost/ticket/4816
//
/// 1st issue
#include <boost/asio/detail/pipe_select_interrupter.hpp>

/// 2nd issue
#ifdef __CYGWIN__
#include <termios.h>
#ifdef cfgetospeed
#define __cfgetospeed__impl(tp) cfgetospeed(tp)
#undef cfgetospeed
inline speed_t cfgetospeed(const struct termios *tp)
{
    return __cfgetospeed__impl(tp);
}
#undef __cfgetospeed__impl
#endif /// cfgetospeed is a macro

/// 3rd issue
#undef __CYGWIN__
#include <boost/asio/detail/buffer_sequence_adapter.hpp>
#define __CYGWIN__
#endif
// END cygwin nastyness.

#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <cassert>

typedef boost::asio::ip::tcp::iostream networkStream;

void writeTestingData(networkStream* out) {
        *out << "Hello world." << std::flush;
//      *out << (char) 0x1 << (char) 0x2 << (char) 0x3 << std::flush;
}

int main() {
        networkStream out("192.168.1.1", "502");

        assert(out.good());

        writeTestingData(&out);
        out.close();
}

To add to the strangeness: if I send the string "Hello world.", it goes in one packet. If I send 0x1, 0x2, 0x3 (the raw byte values), I get 0x1 in the first packet and the rest of the data in the next TCP packet. I am using Wireshark to look at the packets, and there is only a switch between the dev machine and 192.168.1.1.

Sam Miller
xconspirisist
  • Can you try doing `*out << "\001\002\003" << std::flush;` instead of multiple stream calls? – Nikolai Fetissov Jul 27 '11 at 15:52
  • Yes indeed, using "\001..." works fine, i.e. the bytes are all in the same packet and sent as you would expect. Unfortunately this is too awkward to be a solution: realistically, the 0x1, 0x2, etc. that I am sending are variables (1 byte, be that char or uint8_t, I don't care), and putting them in a string is not as nice as using stream operators (<<). Thanks so far though ;) – xconspirisist Jul 27 '11 at 16:21
  • 1
    How about using buffered stream (http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/reference/buffered_write_stream.html)? Another point - how does that legacy binary know if it's one or multiple packets - TCP is a stream - are you confusing packets with number of system calls here? – Nikolai Fetissov Jul 27 '11 at 16:38
  • 1
    @Nikolai I don't think the OP is confusing anything based on their other comments. But that is very weird...the legacy protocol must be looking at raw sockets data or something. – Gravity Aug 13 '11 at 05:35

5 Answers

10

Your code:

out << (char) 0x1 << (char) 0x2 << (char) 0x3;

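This makes three calls to operator<<.

Because of TCP's Nagle algorithm, the TCP stack sends the first small write ((char)0x1) to the peer immediately, during or right after the first operator<< call, since no unacknowledged data is outstanding at that point. The later writes (0x2 and 0x3) are held back until that first segment is acknowledged, so they go out together in the next packet.

Solution for avoiding 1-byte TCP segments: call the sending functions with a bigger chunk of data, ideally the whole message at once.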
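
A minimal sketch of that idea, using only the standard library: collect the single-byte fields into one buffer first, then hand the buffer to the stream with one write() and one flush. The helper names packMessage and sendInOneWrite are made up for illustration, and the sketch assumes the stream (for example the tcp::iostream from the question) passes a single write on as a single send, which depends on its stream buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: pack single-byte fields into one contiguous buffer.
std::string packMessage(const std::vector<std::uint8_t>& fields) {
    std::string payload;
    payload.reserve(fields.size());
    for (std::uint8_t b : fields) {
        payload.push_back(static_cast<char>(b));
    }
    return payload;
}

// Hypothetical helper: give the stream the whole payload in one write(),
// then flush once, instead of one operator<< call per byte.
void sendInOneWrite(std::ostream& out, const std::string& payload) {
    assert(!payload.empty());  // an empty message is treated as a caller bug here
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    out.flush();
}
```

With the question's code this would be called as sendInOneWrite(*out, packMessage({0x1, 0x2, 0x3})).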

SKi
  • Thank you very much indeed for your comment, I accept this as the answer and award you the bonus points. The keyword was "Nagle's algorithm", which exactly describes the problem. I overcame the issue with your suggestion, using a stringstream to store the packet and then sending it all at once. Thank you again. – xconspirisist Aug 15 '11 at 11:56
10

Don't worry, you are far from the only one to have this problem. There is definitely a solution. In fact, you have TWO problems with your legacy protocol, not just one.

Your old legacy protocol requires one "application message" to fit in "one and only one TCP packet" (because it incorrectly uses TCP, a stream-oriented protocol, as if it were packet-oriented). So we must make sure that:

  1. no "application message" is split across multiple TCP packets (the problem you are seeing)
  2. no TCP packet contains more than one "application message" (you are not seeing this but it may definitely happen)

The solution:

problem 1

You must feed your socket all of your "message" data at once. This is currently not happening because, as other people have pointed out, the Boost stream API you use puts data into the socket in separate calls when you chain successive "<<" operators, and the underlying TCP/IP stack of your OS doesn't buffer those calls together (with good reason, for better performance).

Multiple solutions:

  • Pass a char buffer instead of separate chars, so that you make only one call to <<.
  • Forget about Boost, open an OS socket and feed it in one call to send() (on Windows, look for the "winsock2" API; on Unix/Cygwin, look for "sys/socket.h").
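
As a rough sketch of the second option on a POSIX system (Winsock's send() is similar but not shown), the whole message goes to the kernel in a single send() call. The helper name sendWholeMessage is made up for illustration, and fd is assumed to be an already connected stream socket.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: pass the complete message to the kernel in ONE send()
// call. Returns true only if the kernel accepted every byte in that call;
// a short or failed send is reported instead of being retried, because a
// retry would split the message across calls.
bool sendWholeMessage(int fd, const unsigned char* data, std::size_t len) {
    assert(data != nullptr && len > 0);
    ssize_t sent = send(fd, data, len, 0);
    return sent >= 0 && static_cast<std::size_t>(sent) == len;
}
```

For a 20-byte payload a short send is unlikely in practice, but checking the return value keeps the failure visible.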

problem 2

You MUST activate the TCP_NODELAY option on your socket. This option disables Nagle's algorithm and is especially useful for such legacy protocol cases. It makes the OS TCP/IP stack send your data "without delay" instead of holding it back to combine it with another application message you may send later.

  • if you stick with Boost, look for the TCP_NODELAY option, it is in the doc
  • if you use OS sockets, you'll have to use the setsockopt() function on your socket.
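
For the OS socket route, a minimal POSIX sketch looks like the following. The helper names disableNagle and nagleDisabled are made up for illustration; setsockopt(), getsockopt(), IPPROTO_TCP and TCP_NODELAY are the standard names.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper: turn on TCP_NODELAY (i.e. turn off Nagle's algorithm)
// for a TCP socket descriptor. Returns true on success.
bool disableNagle(int fd) {
    assert(fd >= 0);
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0;
}

// Hypothetical helper: read the option back to confirm it took effect.
bool nagleDisabled(int fd) {
    assert(fd >= 0);
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) != 0) {
        return false;
    }
    return value != 0;
}
```

In Boost.Asio the corresponding socket option is boost::asio::ip::tcp::no_delay, set through set_option() on the socket.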

Conclusion

If you solve those two problems, you should be fine!

The OS socket API, on either Windows or Linux, is a bit tricky to use, but you'll gain full control over its behaviour.

Offirmo
  • Thank you very much for your comprehensive description. I would have liked to award you the points but I'm afraid that "User1" more accurately described the problem (Nagle's algorithm) and suggested the resolution. Upvote for your efforts though, thank you. – xconspirisist Aug 15 '11 at 11:55
1

I am not sure who would have imposed a requirement that an entire payload be within one TCP packet. TCP is by nature a stream protocol, and many details, such as the number of packets sent and the payload size of each, are left to the operating system's TCP stack implementation.

I would double check to see if this is an actual restriction of your protocol or not.

feathj
  • 3
    There are many implementations that (stupidly, erroneously, idiotically, but there you have it) rely on specific TCP stack behavior. This is particularly widespread in embedded device software. – Mihai Limbășan Jul 27 '11 at 18:32
  • Mihai, thank you for reasserting my point. Indeed this is the case: the embedded devices do have an exceedingly crude and old implementation of the TCP stack. Despite the comments of others, I cannot just re-code it, and I do understand what a packet is! – xconspirisist Jul 28 '11 at 07:05
1

I agree with User1's answer. You probably invoke operator<< several times; the first invocation immediately sends the first byte over the network, then Nagle's algorithm comes into play, so the remaining data is sent together in a single packet.

Nevertheless, even if the packetization were not an issue, the very fact that you call a socket send function so frequently on small pieces of data is a big problem. Every function called on a socket triggers a heavy kernel-mode transition (a system call); calling send for every byte is simply insane!

You should first format your message in memory and then send it. For your design I'd suggest creating a sort of cache stream that accumulates the data in its internal buffer and sends it all at once to the underlying stream.
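
One way such a cache stream could look, as a rough sketch only: a small accumulator that keeps the << syntax for single-byte variables and forwards the collected bytes to the real stream in one write. The class name MessageBuffer and its members are made up for illustration and are not part of Boost or the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical accumulator: operator<< only appends to an in-memory buffer;
// nothing reaches the underlying stream until flushTo() is called.
class MessageBuffer {
public:
    MessageBuffer& operator<<(std::uint8_t byte) {
        bytes_.push_back(static_cast<char>(byte));
        return *this;
    }

    std::size_t size() const { return bytes_.size(); }

    // Forward the whole message with one write() and one flush, then reset
    // the buffer so the object can be reused for the next message.
    void flushTo(std::ostream& out) {
        assert(out.good());  // same sanity check as in the question's main()
        out.write(bytes_.data(), static_cast<std::streamsize>(bytes_.size()));
        out.flush();
        bytes_.clear();
    }

private:
    std::string bytes_;
};
```

Usage would be along the lines of: MessageBuffer msg; msg << a << b << c; msg.flushTo(out); where a, b and c are uint8_t variables and out is the network stream.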

valdo
0

It is erroneous to think of data sent over a TCP socket as packets. It is a stream of bytes; how you frame the data is application specific.

Any suggestions?

I suggest you implement a protocol such that the receiver knows how many bytes to expect. One popular way to accomplish this is to send a fixed size header indicating the number of bytes for the payload.
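
A minimal sketch of that header approach, assuming a 4-byte big-endian length prefix (the header size and byte order are choices made for this example, not something the question's legacy protocol defines). The function names frame and unframe are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Prepend a 4-byte big-endian length header to the payload.
std::string frame(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Try to extract one complete message from the front of the received bytes.
// Returns false (and leaves rx untouched) until header and payload are both
// fully present, no matter how the bytes were split on the wire.
bool unframe(std::string& rx, std::string& payload) {
    if (rx.size() < 4) return false;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i) {
        n = (n << 8) | static_cast<unsigned char>(rx[i]);
    }
    if (rx.size() - 4 < n) return false;
    payload = rx.substr(4, n);
    rx.erase(0, static_cast<std::size_t>(n) + 4);
    return true;
}
```

The receiver appends whatever each read returns to rx and calls unframe in a loop, so a message arriving as 1 byte followed by the rest is handled the same way as one arriving whole.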

Sam Miller
  • 3
    While true, I don't think this answers the question. This should have been a comment. – Hasturkun Jul 27 '11 at 16:31
  • @Hasturkun I've added a more fully baked answer. – Sam Miller Jul 27 '11 at 16:33
  • 2
    As stated in my original question, I am unable to alter the protocol. It is very legacy, and while I fully appreciate that it should not matter if it arrives in 1, 10, or 999 packets, this protocol and my application most certainly require the payload to fit within, and be transmitted within, 1 TCP packet. The payload is 20 bytes, so this should not be a problem. I am aware of the distinction between packets and streams; I used the terms as they were the best construct I had for asking this question. – xconspirisist Jul 28 '11 at 07:01