I posted a similar question yesterday, but I did a poor job of outlining my problem, and I think I have made progress since then.
My minimal working example is still quite long, so I will post the relevant snippets; the full example can be found here.
My problem is quite simple: I have two POSIX message queues that are set up to be asynchronous, and both are handled by the same handler on the same thread. The issue sits at a more fundamental level than the queues themselves: if a separate thread sends to both queues sequentially, the signal handler only runs once, for the first queue. This makes sense given that when a signal invokes a handler it is automatically blocked, according to GNU.
As such, when I am configuring my struct sigaction I made sure to remove the target signal (SIGIO) from the sigset_t that I set as sa_mask. My assumption was that then using SA_NODEFER, as explained in sigaction(2), the signal handler would be able to be called recursively (not sure if recursively is the right word here).
sa_mask specifies a mask of signals which should be blocked (i.e., added to the signal mask of the thread in which the signal handler is invoked) during execution of the signal handler. In addition, the signal which triggered the handler will be blocked, unless the SA_NODEFER flag is used.
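To check that reading of the man page in isolation, here is a minimal sketch that is separate from my message queue code. It assumes SIGUSR1 and raise() as stand-ins for SIGIO and the queue notification, and the names probe_handler and sigusr1_blocked_in_handler are made up for this illustration. The handler only records whether the triggering signal is in the thread's mask while it runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if the triggering signal was blocked inside it,
 * 0 if it was not, -1 if the handler never ran. */
static volatile sig_atomic_t blocked_in_handler = -1;

/* Hypothetical probe handler: records whether signo is currently blocked. */
static void probe_handler(int signo)
{
    sigset_t cur;
    /* With a NULL new set, sigprocmask only reports the current mask. */
    sigprocmask(SIG_BLOCK, NULL, &cur);
    blocked_in_handler = sigismember(&cur, signo);
}

/* Install probe_handler for SIGUSR1 with the given sa_flags, raise the
 * signal once, and report whether SIGUSR1 was blocked in the handler. */
static int sigusr1_blocked_in_handler(int flags)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sa.sa_flags = flags;
    sigemptyset(&sa.sa_mask); /* nothing extra blocked via sa_mask */

    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    blocked_in_handler = -1;
    raise(SIGUSR1); /* returns only after the handler has run */
    return blocked_in_handler;
}
```

Calling sigusr1_blocked_in_handler(0) and then sigusr1_blocked_in_handler(SA_NODEFER) from main should show the signal blocked in the first case and not blocked in the second, which matches the quoted paragraph.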
The relevant code for attaching the signal handler to the message queue:
/** mq_open returns (mqd_t)-1 on failure, so test for that explicitly */
conn->fd = mq_open(conn->name, O_CREAT | O_RDONLY | O_NONBLOCK, 0644, &attr);
assert(conn->fd != (mqd_t)-1);
/** Setup handler for SIGIO */
/** sigaction(2) specifies that the triggering signal is blocked in the handler */
/** unless SA_NODEFER is specified */
sa.sa_flags = SA_SIGINFO | SA_RESTART | SA_NODEFER;
sa.sa_sigaction = sigHandler;
/** sa_mask specifies signals that will be blocked in the thread the signal */
/** handler executes in */
sigfillset(&sa.sa_mask);
sigdelset(&sa.sa_mask, SIGIO);
if (sigaction(SIGIO, &sa, NULL)) {
    printf("Sigaction failed\n");
    goto error;
}
printf("Handler set in PID: %d for TID: %d\n", getpid(), gettid());
/** fcntl(2) - F_SETOWN_EX is used to target SIGIO and SIGURG signals to a */
/** particular thread */
struct f_owner_ex cur_tid = { .type = F_OWNER_TID, .pid = gettid() };
assert(-1 != fcntl(conn->fd, F_SETOWN_EX, &cur_tid));
As a sanity check, I inspected the signal mask inside the handler to see whether SIGIO was blocked.
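void sigHandler(int signal, siginfo_t *info, void *context)
{
    sigset_t sigs;
    sigemptyset(&sigs);
    /* With a NULL new set, only the current mask is fetched into sigs */
    pthread_sigmask(SIG_BLOCK, NULL, &sigs);
    if (sigismember(&sigs, SIGIO)) {
        /* printf is not async-signal-safe; used here for debugging only */
        printf("SIGIO being blocked in handler\n");
        /* Unblock only SIGIO; reusing sigs as-is would unblock the whole current mask */
        sigemptyset(&sigs);
        sigaddset(&sigs, SIGIO);
        pthread_sigmask(SIG_UNBLOCK, &sigs, NULL);
    }
    ...
}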
But SIGIO does not appear to be blocked. Given the two message queues MQ1 and MQ2, which both notify asynchronously through the same SIGIO handler, my reasoning tells me the following should happen. The timing of the two threads and the latency of the signals make it hard for me to know for sure, so this is a somewhat educated guess:
- mq_send to MQ1 directly followed by mq_send to MQ2 from thread 1
- MQ1's signal handler should fire given the SIGIO from MQ1 on thread 2
- MQ2's signal handler would interrupt MQ1's signal handler on thread 2
- MQ2's signal handler completes on thread 2
- MQ1's signal handler completes on thread 2
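The nesting I am expecting in the list above can be sketched without any message queues. This is only an illustration under assumptions: SIGUSR1 stands in for SIGIO, a raise() inside the first handler invocation stands in for the second queue's notification arriving mid-handler, and nesting_handler and max_nesting_depth are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t depth = 0;     /* handler invocations currently active */
static volatile sig_atomic_t max_depth = 0; /* deepest nesting observed */
static volatile sig_atomic_t calls = 0;     /* total handler invocations */

/* Hypothetical handler: the first invocation raises the same signal again,
 * standing in for a second notification that arrives while it is running. */
static void nesting_handler(int signo)
{
    depth++;
    calls++;
    if (depth > max_depth)
        max_depth = depth;
    if (calls == 1)
        raise(signo);
    depth--;
}

/* Install nesting_handler for SIGUSR1 with the given sa_flags, trigger it
 * once, and return the deepest handler nesting that was observed. */
static int max_nesting_depth(int flags)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = nesting_handler;
    sa.sa_flags = flags;
    sigemptyset(&sa.sa_mask);

    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    depth = 0;
    max_depth = 0;
    calls = 0;
    raise(SIGUSR1); /* both handler invocations finish before this returns */
    return max_depth;
}
```

With SA_NODEFER the second invocation interrupts the first, so the depth reaches 2. Without the flag, the second signal stays pending until the first invocation returns, so the handler still runs twice but the depth never exceeds 1.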
Running the example I linked earlier, the following behaviour is observed:
- mq_send to MQ1 directly followed by mq_send to MQ2 from thread 1
- MQ1's signal handler fires and completes
This makes me think that somehow SIGIO is being blocked or ignored during the signal handler. Given what I have read about sa_mask and my sanity check using pthread_sigmask, I am not sure why I am getting the behavior I am seeing. I am hoping I have missed some little nugget of knowledge somewhere in the manpages.