I have built a C++ library using Boost.Asio. The library needs to be both thread-safe and fork-safe. It has a service scheduler thread, which calls io_service::run(). To support fork-safety, I've registered pre_fork, post_fork_parent and post_fork_child handlers. The pre_fork() handler calls _io_service.notify_fork(boost::asio::io_service::fork_prepare), the post_fork_parent handler calls _io_service.notify_fork(boost::asio::io_service::fork_parent), and post_fork_child calls _io_service.notify_fork(boost::asio::io_service::fork_child).
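
For reference, handlers like these are usually registered through POSIX pthread_atfork(). The registration code is not shown in the question, so the following is only a minimal sketch under that assumption: the counters stand in for the real handler bodies (the notify_fork() calls), and register_fork_handlers and fork_and_report_child_calls are hypothetical helper names.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Counters stand in for the real handler bodies (the notify_fork() calls).
static int g_prepare_calls = 0;
static int g_parent_calls = 0;
static int g_child_calls = 0;

static void pre_fork() { ++g_prepare_calls; }         // parent, before fork()
static void post_fork_parent() { ++g_parent_calls; }  // parent, after fork()
static void post_fork_child() { ++g_child_calls; }    // child, after fork()

// Registers the three handlers; returns 0 on success, like pthread_atfork().
int register_fork_handlers() {
    return pthread_atfork(pre_fork, post_fork_parent, post_fork_child);
}

// Forks once. The child exits with its own count of child-handler calls,
// and the parent returns that exit status (or -1 on error).
int fork_and_report_child_calls() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(g_child_calls);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child handler runs on the thread that called fork(), inside the child, which is why anything it locks must not be held by a thread that exists only in the parent.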

The problem I'm facing is that when fork() happens, the service scheduler thread might be in the middle of some operation and might have acquired a lock on data members of the io_service object. The child process sees them in the same state, so when post_fork_child() calls _io_service.notify_fork(boost::asio::io_service::fork_child), it tries to acquire the lock on the same object and blocks indefinitely (as there is no thread in the child to release the lock).
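
The mechanism can be reproduced without Boost. The sketch below is an illustration only, with hypothetical names (g_lock, hold_lock, child_inherits_locked_mutex): a second thread holds a plain pthread mutex while the main thread forks, and the child probes the mutex with pthread_mutex_trylock() instead of locking it, so the demonstration reports the problem rather than hanging. Strictly speaking, the child of a multithreaded process should only call async-signal-safe functions; the probe is used here purely to make the inherited state visible.

```cpp
#include <cerrno>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_ready[2];    // worker -> main: "I hold the lock"
static int g_release[2];  // main -> worker: "you may unlock now"

// Stand-in for the scheduler thread: takes the lock and keeps it
// until the main thread says the experiment is over.
static void* hold_lock(void*) {
    char byte = 'x';
    pthread_mutex_lock(&g_lock);
    ssize_t n = write(g_ready[1], &byte, 1);       // tell main we hold the lock
    if (n == 1) n = read(g_release[0], &byte, 1);  // block until main is done
    (void)n;
    pthread_mutex_unlock(&g_lock);
    return 0;
}

// Returns true if the forked child found the mutex already locked.
bool child_inherits_locked_mutex() {
    if (pipe(g_ready) != 0 || pipe(g_release) != 0) return false;
    pthread_t tid;
    if (pthread_create(&tid, 0, hold_lock, 0) != 0) return false;

    char byte = 0;
    bool locked_in_child = false;
    if (read(g_ready[0], &byte, 1) == 1) {  // the worker now holds g_lock
        pid_t pid = fork();
        if (pid == 0) {
            // Child: the worker thread does not exist here, but the mutex
            // memory was copied in its locked state.
            _exit(pthread_mutex_trylock(&g_lock) == EBUSY ? 1 : 0);
        }
        int status = 0;
        if (pid > 0 && waitpid(pid, &status, 0) == pid) {
            locked_in_child = WIFEXITED(status) && WEXITSTATUS(status) == 1;
        }
    }
    ssize_t n = write(g_release[1], &byte, 1);  // let the worker unlock and exit
    (void)n;
    pthread_join(tid, 0);
    close(g_ready[0]); close(g_ready[1]);
    close(g_release[0]); close(g_release[1]);
    return locked_in_child;
}
```

A blocking pthread_mutex_lock() in the child at the same point would never return, which is the situation the stack trace below shows inside dev_poll_reactor::fork_service.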

The stack trace I see in the blocked child process is:

fffffd7ffed07577 lwp_park (0, 0, 0) 
fffffd7ffecffc18 mutex_lock_internal () + 378 
fffffd7ffecfffb2 mutex_lock_impl () + 112 
fffffd7ffed0007b mutex_lock () + b 
fffffd7fff26419d __1cFboostEasioGdetailLscoped_lock4n0CLposix_mutex__2t5B6Mrn0D__v_ () + 1d 
fffffd7fff2866a2 __1cFboostEasioGdetailQdev_poll_reactorMfork_service6Mn0BKio_serviceKfork_event__v_ () + 32 
fffffd7fff278527 __1cFboostEasioGdetailQservice_registryLnotify_fork6Mn0BKio_serviceKfork_event__v_ () + 107 
fffffd7fff27531c __1cDdesGtunnelQServiceSchedulerPpost_fork_child6M_v_ () + 1c 
fffffd7fff29de24 post_fork_child () + 84 
fffffd7ffec92188 _postfork_child_handler () + 38 
fffffd7ffecf917d fork () + 12d 
fffffd7ffec172d5 fork () + 45 
fffffd7ffef94309 fork () + 9 
000000000043299d main () + 67d 
0000000000424b2c ???????? () 

Apparently the dev_poll_reactor is locked in the service scheduler thread (it seems to be dispatching some pending events) when the fork happens, which causes the problem.

I think that to solve the problem, I need to ensure that the service scheduler thread is not in the middle of any processing when the fork happens. One way to guarantee that would be to call io_service.stop() in the pre_fork() handler, but that doesn't sound like a good solution. Could you please let me know what the right approach is to make the library fork-safe?

The code looks something like this.

/** 
 * Combines Boost.ASIO with a thread for scheduling. 
 */ 
class ServiceScheduler : private boost::noncopyable 
{ 
public : 
    /// The actual thread used to perform work. 
    boost::shared_ptr<boost::thread>             _service_thread; 

    /// Service used to manage async I/O events 
    boost::asio::io_service                      _io_service; 

    /// Work object to block the ioservice thread. 
    std::auto_ptr<boost::asio::io_service::work> _work; 
    ... 
}; 

/** 
 * CTOR 
 */ 
ServiceScheduler::ServiceScheduler() 
    : _io_service(), 
      _work(std::auto_ptr<boost::asio::io_service::work>( 
              new boost::asio::io_service::work(_io_service))), 
      _is_running(false) 
{ 
} 

/** 
 * Starts a thread to run async I/O service to process the scheduled work. 
 */ 
void ServiceScheduler::start() 
{ 
    ScopedLock scheduler_lock(_mutex); 
    if (!_is_running) { 
        _is_running = true; 
        _service_thread = boost::shared_ptr<boost::thread>( 
                new boost::thread(boost::bind( 
                        &ServiceScheduler::processServiceWork, this))); 
    } 
} 

/** 
 *  Processes work passed to the ASIO service and handles uncaught 
 *  exceptions 
 */ 
void ServiceScheduler::processServiceWork() 
{ 
    try { 
        _io_service.run(); 
    } 
    catch (...) { 
    } 
} 

/** 
 * Pre-fork handler 
 */ 
void ServiceScheduler::pre_fork() 
{ 
    _io_service.notify_fork(boost::asio::io_service::fork_prepare); 
} 

/** 
 * Post-fork parent handler 
 */ 
void ServiceScheduler::post_fork_parent() 
{ 
    _io_service.notify_fork(boost::asio::io_service::fork_parent); 
} 

/**
 * Post-fork child handler 
 */ 
void ServiceScheduler::post_fork_child() 
{ 
    _io_service.notify_fork(boost::asio::io_service::fork_child);
}

I'm using boost 1.47 and running the application on Solaris i386. The library and application are built using studio-12.0.

Sam Miller
ranadheer p
  • Are you expecting to do anything other than call exec() or _exit() in the child after you call fork()? If so, you should reconsider. If not, I don't see the problem. – janm Mar 04 '12 at 03:27
  • You can reserve the main thread for administration, command-interface tasks and parent-child handling only. After fork, only the main thread exists in the child. You can keep internal configuration data to restore from and create the needed threads in the child process. This ensures clean encapsulation and avoids the need for locking. – Mel Viso Martinez Apr 17 '15 at 08:51
  • After trying to use boost::asio for two projects, I reached the conclusion it is better not to use boost. It segfaults even on simple examples. Its complex template structure is excessively difficult to understand and impossible to meaningfully step through and identify even a probable cause. – wallyk Jun 29 '15 at 19:04

2 Answers

The Asio documentation specifies that notify_fork() must not be called while any other thread is executing io_service code:

This function must not be called while any other io_service function, or any function on an I/O object associated with the io_service, is being called in another thread. It is, however, safe to call this function from within a completion handler, provided no other thread is accessing the io_service.

That appears to include run() and any of the I/O associated with the library. I think your pre_fork processing should reset the work item so that run() can exit.

e.g. from the Boost documentation:

boost::asio::io_service io_service;
std::auto_ptr<boost::asio::io_service::work> work(
    new boost::asio::io_service::work(io_service));
...
void pre_fork() {
  work.reset(); // Allow run() to exit.
  // check run has finished (e.g. join the thread that called run())...
  io_service.notify_fork(...);
}

Care still needs to be taken:

  1. Ensure run() is not called before post_fork() has completed.
  2. Ensure a new work object is created for the next run.
  3. Use proper synchronization to detect that run() has terminated (for example, by joining the thread that called it).
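
One way to satisfy the three points above is to join the worker thread in the pre-fork step and restart it afterwards. The sketch below shows that pattern with the standard library only: QuiescableWorker and fork_with_quiesced_worker are hypothetical names, and the task queue is a stand-in for io_service, not Boost.Asio itself. With the real library, quiesce() would correspond to resetting the work object and joining the thread before calling notify_fork(fork_prepare).

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the io_service + scheduler thread pair.
class QuiescableWorker {
public:
    QuiescableWorker() : stopping_(false) {}
    ~QuiescableWorker() { quiesce(); }

    // Starts the worker thread if it is not already running.
    void start() {
        if (thread_.joinable()) return;
        stopping_ = false;  // safe: no worker thread exists at this point
        thread_ = std::thread(&QuiescableWorker::loop, this);
    }

    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            tasks_.push_back(std::move(task));
        }
        wake_.notify_one();
    }

    // Drains queued tasks, then joins the thread. Afterwards no thread
    // holds mutex_, so a fork() at this point copies it unlocked.
    void quiesce() {
        if (!thread_.joinable()) return;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

private:
    void loop() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
            if (tasks_.empty()) return;  // stopping_ is set and the queue is drained
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            lock.unlock();  // never run user code while holding the lock
            task();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()> > tasks_;
    bool stopping_;
    std::thread thread_;
};

// Forks with the worker quiesced. The child restarts its own worker, runs
// one task and reports the result through its exit status.
int fork_with_quiesced_worker(QuiescableWorker& worker) {
    worker.quiesce();  // "pre_fork": no worker thread, no held lock
    pid_t pid = fork();
    if (pid < 0) { worker.start(); return -1; }
    if (pid == 0) {    // "post_fork_child"
        int value = 0;
        worker.start();
        worker.post([&value] { value = 7; });
        worker.quiesce();
        _exit(value);
    }
    worker.start();    // "post_fork_parent"
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Joining the thread is what provides the synchronization in point 3: once join() returns, the worker cannot be inside any locked region, so the child inherits only unlocked state.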
mksteve

You could use io_service::run_one to check whether a fork is scheduled, i.e. whether the io_service should still be running. When a fork is about to happen, some work can be posted to the io_service to wake the thread up. The thread then checks the run condition and stops immediately. After the fork has happened, either the parent or the child can restart a worker thread.

/**
 * Combines Boost.ASIO with a thread for scheduling.
 */
class ServiceScheduler : private boost::noncopyable
{
public :
    /// The actual thread used to perform work.
    boost::shared_ptr<boost::thread>             _service_thread;

    /// Service used to manage async I/O events
    boost::asio::io_service                      _io_service;

    /// Work object to block the ioservice thread.
    std::auto_ptr<boost::asio::io_service::work> _work;
    ServiceScheduler();
    void start();
    void pre_fork();
private:
    void processServiceWork();
    void post_fork_parent();
    void post_fork_child();
    std::atomic<bool> _is_running;
};

/**
 * CTOR
 */
ServiceScheduler::ServiceScheduler()
    : _io_service(),
      _work(std::auto_ptr<boost::asio::io_service::work>(
              new boost::asio::io_service::work(_io_service))),
      _is_running(false)
{
}

/**
 * Starts a thread to run async I/O service to process the scheduled work.
 */
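void ServiceScheduler::start()
{
    // exchange() sets the flag and returns its previous value, so the
    // thread is created only on the false -> true transition. Without
    // setting the flag, the run_one() loop would exit immediately.
    if (!_is_running.exchange(true)) {
        _service_thread = boost::shared_ptr<boost::thread>(
                new boost::thread(boost::bind(
                        &ServiceScheduler::processServiceWork, this)));
    }
}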

/**
 *  Processes work passed to the ASIO service and handles uncaught
 *  exceptions
 */
void ServiceScheduler::processServiceWork()
{
    try {
        while(_is_running) {
            _io_service.run_one();
        }
    }
    catch (...) {
    }
    _is_running = false;
}

/**
 * Pre-fork handler
 */
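void ServiceScheduler::pre_fork()
{
    _is_running = false;
    // Wake run_one() so the loop re-checks the flag.
    _io_service.post([](){ /*no_op*/});
    // Nothing to join if start() was never called.
    if (_service_thread) {
        _service_thread->join();
        _service_thread.reset();
    }
    _io_service.notify_fork(boost::asio::io_service::fork_prepare);
}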

/**
 * Post-fork parent handler
 */
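void ServiceScheduler::post_fork_parent()
{
    // Notify first: notify_fork() must not run while another thread is
    // inside the io_service, so the worker is restarted only afterwards.
    _io_service.notify_fork(boost::asio::io_service::fork_parent);
    start();
}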

/**
 * Post-fork child handler
 */
void ServiceScheduler::post_fork_child()
{
    _io_service.notify_fork(boost::asio::io_service::fork_child);
}
David Feurle