I have a server written in C which blocks in accept(), waiting for new incoming connections. When a new connection is accepted, it creates a new process by calling fork(). I don't use epoll because each client socket is handled by an independent process, and one of the libraries it uses crashes in a multi-threaded environment.
Here is the server code:
srv_sock = init_unix_socket();
listen(srv_sock, 5);
/* Other code which handles SIGCLD. */
while (1) {
    log_info("Awaiting new incoming connection.");
    clt_sock = accept(srv_sock, NULL, NULL);
    if (clt_sock < 0) {
        log_err("Error ...");
        continue;
    }
    log_info("Connection %d accepted.", clt_sock);
    cld_pid = fork();
    if (cld_pid < 0) {
        log_err("Failed to create new process.");
        close(clt_sock);
        continue;
    }
    if (cld_pid == 0) {
        /* Child: serve this client, then exit. */
        /* Initialize libraries. */
        /* Handle client connection ... */
        shutdown(clt_sock, SHUT_RDWR);
        close(clt_sock);
        _exit(0);
    }
    else {
        /* Parent: the child owns the connection now. */
        log_info("Child process created for socket %d.", clt_sock);
        close(clt_sock);
    }
}
The client is written in Java. It connects to the server using the junixsocket library, since Java doesn't support Unix domain sockets. Once connected, it sends a request (a header + XML document) and waits for the server's reply.
Here is the client code:
File socketFile = new File(UNIX_SOCKET_PATH);
AFUNIXSocket socket = AFUNIXSocket.newInstance();
socket.connect(new AFUNIXSocketAddress(socketFile));
InputStream sis = socket.getInputStream();
OutputStream sos = socket.getOutputStream();
logger.info("Connected with server.");
byte[] requestHeader;
byte[] requestBuffer;
sos.write(requestHeader, 0, requestHeader.length);
logger.info("Header sent.");
sos.write(requestBuffer, 0, requestBuffer.length);
logger.info("Request XML sent.");
sos.flush();
The problem appears when 3 client threads connect to the server at the same time: only 1 task runs, while the other 2 keep waiting until the first one has finished.
I have checked the logs. All 3 client threads connected and sent their requests at (almost) the same time, but the server accepted only the first one to arrive and delayed the other 2. According to the logs, there is a delay of 3 minutes between connect on the client side and accept on the server side.
At first I thought the delay might be caused by some sort of buffering, so I called OutputStream.flush() after each OutputStream.write call, but the problem persists.
I cannot figure out what might cause this delay. Any ideas?
Thank you.
Update Mar 15 2016
pstack shows that the parent process was blocked at waitpid in my SIGCHLD handler. This was probably why accept didn't return when a new incoming connection arrived: the normal execution flow had been interrupted by the signal handler.
Here is the code of my signal handler:
static void _zombie_reaper (int signum) {
    int status;
    pid_t child;
    if (signum != SIGCHLD) {
        return;
    }
    while ((child = waitpid(-1, &status, WNOHANG)) != -1) {
        continue;
    }
}

/* In main function */
struct sigaction sig_act;
memset(&sig_act, 0, sizeof(struct sigaction));
sigemptyset(&sig_act.sa_mask);
sig_act.sa_flags = SA_NOCLDSTOP;
sig_act.sa_handler = _zombie_reaper;
if (sigaction(SIGCHLD, &sig_act, NULL) < 0) {
    log_err("Failed to register signal handler.");
}