
I have a few socket questions and could not find a definite yes or no, so apologies in advance if this is a repost :) The platform is Linux 2.6.30 and the app is C++. I am still very new to networking and coming up to speed.

  1. Is the socket API thread safe? For example, can I call send on the same socket from multiple threads without a mutex, or do I have to serialize access with my own mutexes?

  2. Is it better to use poll/select to check whether my send will block and then do the send, or to just call send and let the send API's internal queuing take care of it? If the threads are going to block anyway (if I don't use a timeout, that is), I don't really see why a poll followed by a send is required.

  3. Are sockets zero copy by default in Linux, or is there a copy involved? If there is a copy, is there a size restriction (not in terms of the API, but in terms of the granularity)? If the kernel does copy, are zero-copy sockets available?

  4. If I had to communicate between two machines, I would assume multiple sockets would utilize the bandwidth much better than a single socket. Is that a correct assumption? What's the best way to utilize the full bandwidth between two regular Linux machines?

  5. What's your favorite tool for measuring current bandwidth usage on interfaces? This is probably more a matter of preference. I looked at iptraf etc., but would like to see what others use and like most.

Mark Lobo

2 Answers

  1. The socket API is definitely thread safe on a per-socket basis... i.e. as long as any given socket is accessed by only one thread, other threads can be accessing other sockets at the same time without any problems. Having multiple threads accessing the same socket at the same time may or may not be "thread safe" (for some definitions of thread safe) but in any case it is not recommended, since the resulting behavior is not easily predictable. (e.g. two threads call recv() on the same socket at around the same time, which thread will get which bytes from the incoming TCP stream? God only knows)

  2. If you are using blocking I/O, then checking with poll/select() before calling send() isn't terribly useful, since even if select() indicates you have space in the outgoing buffer, you might block inside send() anyway. (e.g. if there are 32 bytes of space in the buffer, and then you try to send 64 bytes, send() will block). select() and poll() are more useful in conjunction with non-blocking I/O.

  3. I'm not really sure about the state of zero-copy in Linux sockets, but the Wikipedia article seems to suggest that Linux only does zero-copy if you are using the sendfile(), sendfile64(), or splice() calls. Given a modern CPU I doubt it matters all that much anyway -- all but the most high-performance programs are going to have plenty of CPU cycles available to copy network data; it's the speed of the network interface itself that will be the bottleneck.

  4. Multiple sockets will not give you better bandwidth utilization than a single socket; a single socket can easily saturate a network connection if you pump data across it quickly enough. Multiple TCP connections in particular will generally give you worse bandwidth utilization than a single TCP connection, since each TCP connection incurs its own transmission overhead, and worse, the multiple TCP connections will compete for bandwidth and at some point start causing each other to drop packets, which they will respond to by reducing their send rate in order to reduce congestion.

  5. I can't answer this for Linux; under Mac OS X, Activity Monitor will show network bandwidth usage.

Jeremy Friesner

Jeremy's point #3 is somewhat inaccurate. With multiple high speed network interfaces and a lot of data moving around, zero-copy becomes significantly more important. The time spent copying buffers around in memory impacts throughput and latency. Asynchronous IO and scatter/gather support are critical for high speed, networked systems.

Ross Judson