I have a client server program written in C. The intent is to see how fast big data can be trasported over TCP. The receiving side OS (Ubuntu Linux 14.*) is tuned to improve the TCP performance, as per the documentation around tcp / socket / windows scaling etc. as below:
net.ipv4.tcp_window_scaling = 1
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 16384 16777216
This aprt, I have also increased the individual socket buffer size through setsockopt call.
But I am not seeing the program responding to these changes - the overall throughput is either flat or even reduced at times. When I took tcpdump at the receiving side, I see a monotonic pattern of tcp packets of length 1368 as coming to it, in most (99%) cases.
19:26:06.531968 IP <SRC> > <DEST>: Flags [.], seq 25993:27361, ack 63, win 57, options [nop,nop,TS val 196975830 ecr 488095483], length 1368
As per the documentation, the tcp window scaling option increases the receiving frame size in propotion to the demand and capacity - but all I see is "win 57" - very few bytes remaining in the receiving buffer, which is not matching with the expection.
Hence I start suspecting my assumptions on the tuning itself, and have these questions:
Is there any specific tunables required at the sending side to improve the client side reception? Making sure that you (program) writes the whole chunk of data in one go is not enough?
In the client side tunable as mentioned above necessary and sufficient? The default on in the system are too low, but I don't see the changes applied in /etc/sysctl.conf having any effect. Is running sysctl --system after changes sufficient to make the changes in effect? or do we need to reboot the system?
If the OS is a virtual machine, will these tunables make meaning in its completeness, or are there additional steps at the real physical machine?
I can share the source code if that helps, but I can guarentee that it is just a trivial code.
Here is the code:
#cat client.c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <string.h>
#define size 1024 * 1024 * 32
int main(){
int s;
char buffer[size];
struct sockaddr_in sa;
socklen_t addr_size;
s = socket(PF_INET, SOCK_STREAM, 0);
sa.sin_family = AF_INET;
sa.sin_port = htons(25000);
sa.sin_addr.s_addr = inet_addr("<SERVERIP");
memset(sa.sin_zero, '\0', sizeof sa.sin_zero);
addr_size = sizeof sa;
connect(s, (struct sockaddr *) &sa, addr_size);
int rbl = 1048576;
int g = setsockopt(s, SOL_SOCKET, SO_RCVBUF, &rbl, sizeof(rbl));
while(1) {
int ret = read(s, buffer, size);
if(ret <= 0) break;
}
return 0;
}
And the server code:
bash-4.1$ cat server.c
#include <sys/types.h>
#include <sys/mman.h>
#include <memory.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <netinet/in.h>
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
extern int errno;
#define size 32 * 1024 * 1024
int main() {
int fdsocket;
struct sockaddr_in sock;
fdsocket = socket(AF_INET,SOCK_STREAM, 0);
int rbl = 1048576;
int g = setsockopt(fdsocket, SOL_SOCKET, SO_SNDBUF, &rbl, sizeof(rbl));
sock.sin_family = AF_INET;
sock.sin_addr.s_addr = inet_addr("<SERVERIP");
sock.sin_port = htons(25000);
memset(sock.sin_zero, '\0', sizeof sock.sin_zero);
g = bind(fdsocket, (struct sockaddr *) &sock, sizeof(sock));
if(g == -1) {
fprintf(stderr, "bind error: %d\n", errno);
exit(1);
}
int p = listen(fdsocket, 1);
char *buffer = (char *) mmap(NULL, size, PROT_WRITE|PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if(buffer == -1) {
fprintf(stderr, "%d\n", errno);
exit(-1);
}
memset(buffer, 0xc, size);
int connfd = accept(fdsocket, (struct sockaddr*)NULL, NULL);
rbl = 1048576;
g = setsockopt(connfd, SOL_SOCKET, SO_SNDBUF, &rbl, sizeof(rbl));
int wr = write(connfd, buffer, size);
close(connfd);
}