Previously I had asked a question regarding how to terminate thread blocked for I/O. I have used pthread_kill()
instead of pthread_cancel()
or writing to pipes, considering few advantages. After implementing the code to send signal (SIGUSR2) to the target thread using pthread_kill()
, initially it worked fine and didn't face any issues.
static void signalHandler(int signum) {
LogInfo("SIGNAL HANDLER : %d",signum); //print info message using logger utility
}
void setSignalHandler() {
struct sigaction actions;
memset(&actions, 0, sizeof(actions));
sigemptyset(&actions.sa_mask);
actions.sa_flags = 0;
actions.sa_handler = signalHandler;
int rc = sigaction(SIGUSR2,&actions,NULL);
if(rc < 0) {
LogError("Error in sigaction"); //print error using logger utility
}
}
void stopThread() {
fStop = 1; //set the flag to stop the thread
if( ftid ) { //ftid is pthread_t variable of thread that needs to be terminated.
LogInfo("Before pthread_kill()");
int kill_result = pthread_kill(ftid,SIGUSR2); // send signal to interrupt the thread if blocked for poll() I/O
LogInfo("pthread_kill() returned : %d ( %s )",kill_result,strerror(kill_result));
int result = pthread_join( ftid, NULL );
LogInfo("Event loop stopped for %s", fStreamName);
LogInfo( "pthread_join() -> %d ( %s )\n", result, strerror(result) );
if( result == 0 ) ftid = 0;
}
}
Recently I have noticed a deadlock issue where few threads (which tried to log statements) blocked with below stacktrace
#0 0x00007fc1b2dc565c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007fc1b2d70f0c in _L_lock_2462 () from /lib64/libc.so.6
#2 0x00007fc1b2d70d47 in __tz_convert () from /lib64/libc.so.6
#3 0x00000000004708f7 in Logger::getCurrentTimestamp(char*) ()
getCurrentTimestamp(char*)
function is called from our logger module when LogInfo()
or LogError()
functions are invoked. This function internally calls gettimeofday() to print the current time in logs. So I'm suspecting LogInfo()
function (which calls gettimeofday()
) called inside signalHandler() might be causing the deadlock issue. But i'm not confident of this as this issue is not getting reproduced frequently and i did not get any information stating that gettimeofday()
is not async signal safe.
gettimeofday() is thread safe and i'm considering it's not async signal safe as it was not listed under async signal safe functions.
1) Can I consider gettimeofday()
is not async signal safe and it was the root cause of deadlock ?
2) Are there any known issues where use of pthread_kill()
might cause deadlock ?
EDIT :
Below is the definition of getCurrentTimeStamp()
function
char* Logger::getCurrentTimestamp ( char *tm_buff ) {
char curr_time[ 32 ] = { 0 };
time_t current = time( NULL );
struct timeval detail_time;
if ( tm_buff ) {
strftime( curr_time, sizeof(curr_time), "%Y-%m-%d-%H:%M:%S", localtime( ¤t ) );
gettimeofday( &detail_time, NULL );
sprintf( tm_buff, "%s:%03d", curr_time, (int) (detail_time.tv_usec / 1000) );
return (tm_buff);
}
return (NULL);
}
stack trace of the thread that was interrupted by signal.
#0 0x00007fc1b2dc565c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007fc1b2d70f0c in _L_lock_2462 () from /lib64/libc.so.6
#2 0x00007fc1b2d70d47 in __tz_convert () from /lib64/libc.so.6
#3 0x00000000004708f7 in Logger::getCurrentTimestamp(char*) ()
#4 0x000000000046e80f in Log::write(Logger*, LogType, std::string const&, char const*) ()
#5 0x000000000046df74 in Log::i(Logger*, char const*, std::string const&, std::string const&, char const*, ...) ()
#6 0x0000000000456b67 in ?? ()
#7 <signal handler called>
#8 0x00007fc1b2da8dc5 in _xstat () from /lib64/libc.so.6
#9 0x00007fc1b2d71056 in __tzfile_read () from /lib64/libc.so.6
#10 0x00007fc1b2d703b4 in tzset_internal () from /lib64/libc.so.6
#11 0x00007fc1b2d70d63 in __tz_convert () from /lib64/libc.so.6
#12 0x00000000004708f7 in Logger::getCurrentTimestamp(char*) ()
#13 0x000000000047030e in Logger::writeLog(int, std::string&, std::string&) ()
#14 0x0000000000470633 in Logger::Write(Logger*, int, std::string, std::string const&) ()
#15 0x000000000046eb02 in Log::write(Logger*, LogType, std::string const&, char const*) ()
#16 0x000000000046df74 in Log::i(Logger*, char const*, std::string const&, std::string const&, char const*, ...) ()