I have the following question: can I use a signal handler for SIGCHLD and at specific places use waitpid(3) instead?
Here is my scenario: I start a daemon process that listens on a socket (at this point it's irrelevant if it's a TCP or a UNIX socket). Each time a client connects, the daemon forks a child to handle the request and the parent process keeps on accepting incoming connections. The child handling the request needs at some point to execute a command on the server; let's assume in our example that it needs to perform a copy like this:
cp -a /src/folder /dst/folder
To do so, the child forks a new process that uses execl(3) (or execve(2), etc.) to execute the copy command.
To keep better control over my code, I would like to collect the exit status of the process executing the copy with waitpid(3). Moreover, since my daemon process forks children to handle requests, I need a signal handler for SIGCHLD to prevent zombie processes from piling up.
In my code, I set up a signal handler for SIGCHLD using signal(3), daemonize my program by forking twice, and then listen on my socket for incoming connections. I fork a child process to handle each incoming request, and that child forks a grand-child process to perform the copy and tries to collect its exit status via waitpid(3).
What happens is that SIGCHLD is caught by my handler when a grand-child-process dies, before waitpid(3) takes action and waitpid(3) returns -1 even though the grand-child-process exits with success.
My first thought was to add:
signal(SIGCHLD, SIG_DFL);
just before forking the child process to handle my connecting clients, without any success. Using SIG_IGN didn't work either.
Is there a suggestion on how to make my scenario work?
Thank you all for your help in advance!
PS. If you need code, I'll post it, but due to its size I decided to do so only if necessary.
PS2. My intention is to use my code in FreeBSD, but my checks are performed in Linux.
EDIT [SOLVED]:
The problem I was facing is solved. The "unexpected" behaviour was caused by a bug in my own waitpid(3) handling code.
Hence, a SIGCHLD handler installed with signal(3) in the daemon and explicit waitpid(3) calls in the request-handling children can indeed coexist in daemon-like programs.
Thanks for your help, and I hope this helps someone trying to accomplish the same thing!