Boost::Thread function leading to a segmentation fault on an embedded ARM

Question

I'm having a weird problem with a threaded class using Boost::threads. Here is a brief summary of what I'm doing:

A routine creates a bunch of objects that are made up of a handler class with a private data member that is a shared pointer to a base class that forms an inheritance tree. I'm fairly confident this routine is working correctly, and not part of the problem.

I then call a method of the handler class (startUpdate) which creates a new instance of my threaded class. Here's the threaded class code:

class Sensor_Thread
{
  public:
    //constructor (creates the thread and binds the update function to it)
    Sensor_Thread (const Ptr<Sensor_Base> & theSensor): m_stoprequested (false),
                      s (theSensor),
                      m_thread (boost::bind (&Sensor_Thread::update, this)) { }
    //default null constructor, shouldn't ever be used
    Sensor_Thread (): m_stoprequested (true),
                      m_thread (),
                      s (NULL) { }

    //destructor (automatically joins the thread as per RAII principles)
    ~Sensor_Thread () { m_stoprequested = true; m_thread.join (); }

  private:
    volatile bool m_stoprequested;
    boost::mutex m_mutex;
    boost::thread m_thread;
    Ptr<Sensor_Base> s;

    void update ();

};

(the "Ptr" class there is my shared pointer class... I'm fairly confident it works correctly because I originally got it from a C++ textbook...)

The update function:

void Sensor_Thread::update ()
{
  //make sure we actually have a sensor attached...
  if (s) { 
    // set up structure for sleeping
    struct timespec time;
    while (!m_stoprequested)
    {
      boost::mutex::scoped_lock lock(m_mutex);
      s->update ();
      time.tv_sec = s->updateInterval / 1000;
      time.tv_nsec = (1000 % s->updateInterval) * (1000 * 1000);
      nanosleep (&time, NULL);
    }
  }
}

This runs indefinitely until another trigger in the driver calls stopUpdate and the threaded_class is destroyed.

The weirdness: On my development box, which is OS X 10.6, using darwin gcc 4.2.1, it works fine, exactly as expected.

This is meant to run on an embedded server with debian linux and an ARM processor. I have a cross-compile toolchain provided by the manufacturer of the embedded system, and when I use this to cross-compile, I get a seg fault. Through debugging I've found this seg fault occurs when s->update () is called (or ANY other attempt to dereference the shared pointer and do something to it). However, if I introduce a slight delay, say by adding "sleep(1);" before starting the while loop in my Sensor_Thread::update function, it works flawlessly.

To my mind, this seems to imply that the system is trying to dereference the shared pointer s before it's fully or adequately initialized? The sleep(1) work-around makes it work, but still, that seems extremely weird to me. If the threaded class' shared pointer is initialized during the constructor, shouldn't it be ready before the update function ever gets called? Or does the creation of the boost::thread mean that the update function happens simultaneously to the initialization of the shared pointer owned by the threaded class? Is there any cleaner way than the "sleep" hack to make SURE the shared pointer is initialized before the update function is called?

Thanks!!!

Your faith in textbooks is charming. :) — sarnold, Jan 20 '12 at 00:41

score 3 · Accepted Answer · answered Jan 20 '12 at 00:30

Sensor_Thread (const Ptr<Sensor_Base> & theSensor): m_stoprequested (false),
                  s (theSensor),
                  m_thread (boost::bind (&Sensor_Thread::update, this)) { }

This code is broken. You are calling update on an object that isn't constructed yet. Using this in the initialization list of a constructor should always raise a red flag. It's a pointer to an object that doesn't yet fully exist.
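A small sketch of why the thread can see an uninitialized s. C++ constructs non-static members in the order they are declared in the class, not in the order the initializer list names them. In the question's class, m_thread is declared before s, so the thread can already be running update before s has been constructed. The Tracer and Demo types below are hypothetical and exist only to record construction order.

```cpp
#include <string>
#include <vector>

// Records the order in which members are constructed.
static std::vector<std::string> g_order;

// Hypothetical helper: logs its name when constructed.
struct Tracer {
    explicit Tracer(const char* name) { g_order.push_back(name); }
};

struct Demo {
    // The initializer list names s first, but m_thread is declared first,
    // so m_thread is still constructed first.
    Demo() : s("s"), m_thread("m_thread") {}

    Tracer m_thread;  // declared first, constructed first
    Tracer s;         // declared second, constructed second
};

// Returns the observed construction order as "first,second".
std::string construction_order() {
    g_order.clear();
    Demo d;
    (void)d;
    return g_order[0] + "," + g_order[1];
}
```

Calling construction_order() returns "m_thread,s" even though the initializer list mentions s first. GCC and Clang can flag this kind of mismatch with -Wreorder.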

The usual way to handle this is to split it into two steps. Have a run or start method that creates the thread. Call that method after the constructor returns.
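A minimal sketch of that two-step pattern. To keep it self-contained it uses std::thread, std::atomic and std::shared_ptr as stand-ins for boost::thread, the volatile flag and the question's Ptr class; the same shape should carry over to Boost. The names Sensor, SensorThread, start, stop and run are assumptions for illustration, not the original code.

```cpp
#include <atomic>
#include <chrono>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the question's Sensor_Base.
struct Sensor {
    std::atomic<int> updates{0};
    void update() { ++updates; }
};

class SensorThread {
  public:
    // The constructor only stores state; no thread is running yet.
    explicit SensorThread(std::shared_ptr<Sensor> sensor)
        : m_stoprequested(false), s(std::move(sensor)) {}

    // Call after construction: every member is initialized by now.
    void start() {
        if (s && !m_thread.joinable())
            m_thread = std::thread(&SensorThread::run, this);
    }

    // Ask the loop to stop and wait for it to finish.
    void stop() {
        m_stoprequested = true;
        if (m_thread.joinable()) m_thread.join();
    }

    ~SensorThread() { stop(); }

    SensorThread(const SensorThread&) = delete;
    SensorThread& operator=(const SensorThread&) = delete;

  private:
    // Thread body: update the sensor until a stop is requested.
    void run() {
        while (!m_stoprequested) {
            s->update();
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> m_stoprequested;
    std::shared_ptr<Sensor> s;
    std::thread m_thread;
};
```

The handler would construct the object first and call start() on the next line. Because the thread is created only inside start(), the order of the member declarations no longer matters for correctness.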

while (!m_stoprequested)
{
  boost::mutex::scoped_lock lock(m_mutex);
  s->update ();
  time.tv_sec = s->updateInterval / 1000;
  time.tv_nsec = (1000 % s->updateInterval) * (1000 * 1000);
  nanosleep (&time, NULL);
}

This is probably not what you want. It holds the mutex always, making it very hard for another thread to ever access s. It will have to wait until the next update and then win the race for the mutex. On some platforms, a thread that is doing real work will have a hard time beating an 'interactive' thread (one that mostly sleeps). So this could slow down any other thread that tries to access s by a significant amount.

Why do you call nanosleep while holding the mutex?

Thanks! Should be easy enough to add a starter method. As you can probably tell, I'm new to multithreaded programming, I'm basing this off an example I found online... I am probably a bit confused about the mutex. My goal is to not let anything else have access to the object s points to (s is a shared pointer) while s->update () method runs. The nanosleep function is just setting the update interval, and definitely doesn't have to be inside the mutex... Any insight on how to best achieve my goal? — bencpeters, Jan 20 '12 at 00:55
You could eliminate the scoped_lock and just lock the mutex before `update` and unlock it afterwards. Alternatively, you could put the scoped_lock and the `update` call inside an `if 1` block. (If other threads *modify* `s`, put the two calculations inside the lock too. Otherwise, it doesn't much matter either way, though the general rule is to hold a mutex for as few instructions as possible.) — David Schwartz, Jan 20 '12 at 02:45
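The narrower locking suggested above can be sketched as follows: the lock lives in its own brace-delimited block, so it is released before the sleep. The sketch uses std::mutex in place of boost::mutex, and Counter is a hypothetical stand-in for the object s points to. It also assumes updateInterval is in milliseconds, which the division by 1000 suggests. Under that assumption the question's expression 1000 % s->updateInterval looks like it has its operands swapped: an interval of 250 would give 0 nanoseconds, and 1500 would give 1,000,000,000, which is outside the range nanosleep accepts for tv_nsec.

```cpp
#include <time.h>
#include <mutex>

// Split a millisecond interval into whole seconds and leftover nanoseconds.
inline timespec interval_to_timespec(long interval_ms) {
    timespec ts;
    ts.tv_sec = interval_ms / 1000;
    ts.tv_nsec = (interval_ms % 1000) * 1000000L;
    return ts;
}

// Hypothetical stand-in for the object that s points to.
struct Counter {
    int updates = 0;
    long updateInterval = 1;  // milliseconds (assumed unit)
    void update() { ++updates; }
};

// One loop iteration: lock only around the work, then sleep unlocked.
inline void update_once(Counter& sensor, std::mutex& m) {
    timespec ts;
    {
        std::lock_guard<std::mutex> lock(m);  // released at the closing brace
        sensor.update();
        ts = interval_to_timespec(sensor.updateInterval);
    }
    // POSIX sleep; other threads can take the mutex while this one waits.
    nanosleep(&ts, nullptr);
}
```

Reading updateInterval inside the locked block follows the advice to keep the calculations under the lock if other threads may modify the object.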
