
I am running the following code on Linux on an ARM processor with Boost 1.51. The code does not work as intended: the timed_wait() call returns immediately.

#include <boost/thread/condition.hpp>
#include <boost/thread/xtime.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>

using namespace std;

int main()
{
     boost::mutex mutex_;
     boost::mutex::scoped_lock lock( mutex_ );

     boost::xtime xt;
     boost::condition condition;

     // wait for one second or wait on lock
     boost::xtime_get(&xt, boost::TIME_UTC_);

     xt.sec += 1;

     cout << "Before 1 second wait" << endl;
     condition.timed_wait(lock, xt);
     cout << "After 1 second wait" << endl;

     return 0;
}

On other systems with the same ARM processor and the same Boost 1.51 libraries, but a different version of Linux and glibc, the code works and waits for 1 second.

I tried to debug the issue using strace. The difference is that on the system where the code does not work, the futex() call with FUTEX_WAIT_PRIVATE is never made.

strace from a system where the code is working:

write(1, "Before 1 second wait\n", 21Before 1 second wait)  = 21
futex(0xb6fbf0dc, FUTEX_WAKE_PRIVATE, 2147483647) = 0
clock_gettime(CLOCK_REALTIME, {1438150496, 732211544}) = 0
futex(0xbef07a44, FUTEX_WAIT_PRIVATE, 1, {0, 998193456}) = -1 ETIMEDOUT (Connection timed out)
futex(0xbef07a28, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "After 1 second wait\n", 20After 1 second wait)   = 20

strace from a system where the code is NOT working:

write(1, "Before 1 second wait\n", 21Before 1 second wait)  = 21
futex(0xb6fc90dc, FUTEX_WAKE_PRIVATE, 2147483647) = 0
clock_gettime(CLOCK_REALTIME, {1438150407, 134963583}) = 0
futex(0xbe9be988, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "After 1 second wait\n", 20After 1 second wait)   = 20

Is a kernel or glibc change needed to get this code working?

Bob
  • Try the `predicate` version of timed_wait. It is probably a spurious wakeup that forces the early return. `condition.timed_wait(lock, boost::posix_time::milliseconds(1000), f);` where `f` is defined as `bool f(void) { return false; }`. – Tsyvarev Jul 29 '15 at 17:34
  • Thank you for the suggestion, but the predicate does not help. I think a lower-level futex() call is not being made for some reason, as I mentioned in my question. – Bob Jul 30 '15 at 04:53
  • Hmm, it seems that your Boost installation is broken for some reason. It behaves as if it finds that the timeout has already passed, so there is no need to call `futex` to wait on the condition. You can trace the `boost::xtime_get()` call, look at the timestamp it returns, and compare it to the one returned by `clock_gettime` in the `condition.timed_wait()` call. – Tsyvarev Jul 30 '15 at 07:09
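
To make the "timeout already passed" reading of the two traces concrete, here is a minimal sketch of how an absolute deadline turns into the relative timeout that strace shows for FUTEX_WAIT_PRIVATE. The names `TimePoint`, `remaining_until` and `would_block` are hypothetical and are not Boost or glibc APIs. In the working trace, clock_gettime returned {1438150496, 732211544} and the futex timeout was {0, 998193456}; adding the two gives a deadline of {1438150497, 730405000}, which is back-computed here and does not appear in the trace itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a {seconds, nanoseconds} pair, as printed by strace.
struct TimePoint {
    std::int64_t sec;
    std::int64_t nsec;
};

// Returns deadline - now, clamped to {0, 0} when the deadline is not in the future.
TimePoint remaining_until(TimePoint deadline, TimePoint now) {
    assert(deadline.nsec >= 0 && deadline.nsec < 1000000000);
    assert(now.nsec >= 0 && now.nsec < 1000000000);
    std::int64_t sec = deadline.sec - now.sec;
    std::int64_t nsec = deadline.nsec - now.nsec;
    if (nsec < 0) {          // borrow one second
        sec -= 1;
        nsec += 1000000000;
    }
    if (sec < 0) {
        return TimePoint{0, 0};
    }
    return TimePoint{sec, nsec};
}

// A wait is only worth issuing when some time remains before the deadline.
bool would_block(TimePoint deadline, TimePoint now) {
    const TimePoint r = remaining_until(deadline, now);
    return r.sec > 0 || r.nsec > 0;
}
```

If the deadline handed down is at or before the current time, `would_block` is false and no wait is issued, which would match the trace from the system where the code is not working.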

2 Answers


Instead of mucking with clocks when using timeouts, why not pass a relative timeout?

condition.timed_wait(lock, boost::posix_time::milliseconds(1000));

This prevents all sorts of weird problems.

Mark Jansen
  • The timed_wait call takes an absolute time until which it will wait. In fact, I tried your suggestion, and the result is the same. – Bob Jul 29 '15 at 09:26

I was able to figure out what was happening by varying the timeout from 1 to 100 seconds.

I passed in the timeout as an argument to the program and used

xt.sec += timeout;

When the timeout is above 26, the program waits for (timeout - 26) seconds. In other words, it waits 1 second if the timeout is 27, 2 seconds if the timeout is 28, and so on. For any timeout of 26 or less, the call returns immediately.

The offset of 26 comes from the number of leap seconds. The issue appears when the timezone information on the system contains leap second information. If I change the timezone info (/etc/localtime) to point to a zoneinfo file that does not have leap second information, the Boost API works correctly.
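
The arithmetic observed above can be written down as a small sketch. The function name `observed_wait_seconds` and the constant `kLeapSecondOffset` are made up for illustration; the value 26 is the offset measured on the affected system, not a value taken from Boost or glibc.

```cpp
#include <cassert>

// Assumption: the offset seen on the affected system, attributed to leap seconds.
const long kLeapSecondOffset = 26;

// Seconds the program actually waited for a requested timeout (in seconds),
// according to the measurements described in this answer.
long observed_wait_seconds(long requested, long leap_offset = kLeapSecondOffset) {
    assert(requested >= 0);
    const long effective = requested - leap_offset;
    return effective > 0 ? effective : 0;  // at or below the offset: returns immediately
}
```

With the original program's timeout of 1 second, this gives 0, which is the immediate return from the question.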

Bob
  • I found a very similar SO question: http://stackoverflow.com/questions/2784639/boost-timed-wait-leap-seconds-problem. Its author faced the same issue with Boost 1.38 on RedHat. Actually, the Boost documentation notes that using UTC time for *future* timestamps is not a good idea, precisely because of leap seconds. `boost::chrono::seconds(1)` seems more appropriate for a timeout interval. – Tsyvarev Jul 30 '15 at 12:31
  • Thank you for the comment and chrono class usage. I did not realize it to be a leap second issue when I first posted the query. – Bob Aug 11 '15 at 10:11
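
For reference, a relative-duration wait in the spirit of the `boost::chrono::seconds(1)` suggestion might look like the sketch below. It swaps in the C++11 standard library (`std::condition_variable::wait_for` with `std::chrono`) instead of Boost so that it compiles without extra dependencies; `wait_relative` is a hypothetical helper name. Whether this sidesteps the leap second offset on the affected system has not been verified here.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: blocks for roughly `timeout` and returns the elapsed
// time measured on the monotonic clock.
std::chrono::milliseconds wait_relative(std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    std::mutex m;
    std::condition_variable cv;
    std::unique_lock<std::mutex> lock(m);

    const auto start = std::chrono::steady_clock::now();
    // Nobody notifies cv and the predicate is always false, so a spurious
    // wakeup simply goes back to waiting until the duration has elapsed.
    cv.wait_for(lock, timeout, [] { return false; });
    const auto elapsed = std::chrono::steady_clock::now() - start;

    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed);
}
```

Calling `wait_relative(std::chrono::milliseconds(1000))` corresponds to the one-second wait in the question, expressed as a duration rather than as a UTC timestamp in the future.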