Can boost::atomic really improve performance by reducing overhead of sys calls (in mutex/semaphore) in multithreading?

Question

I am trying to compare the performance of boost::atomic and pthread mutex on Linux:

 pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER ;
 int g = 0 ;

 void f()
 {

    pthread_mutex_lock(&mutex);
    ++g;
    pthread_mutex_unlock(&mutex);
    return ;
 }
 const int threadnum = 100;
 int main()  
 {
    boost::threadpool::fifo_pool tp(threadnum);
    for (int j = 0 ; j < 100 ; ++j)
    {
            for (int i = 0 ; i < threadnum ; ++i)
                    tp.schedule(boost::bind(f));
            tp.wait();
    }
    std::cout << g << std::endl ;
    return 0 ; 
 }

Its timing:

 real    0m0.308s
 user    0m0.176s
 sys     0m0.324s

I also tried boost::atomic:

 boost::atomic<int> g(0) ;

 void f()
 {
    ++g;
    return ;
 }
 const int threadnum = 100;
 int main()
 {
    boost::threadpool::fifo_pool tp(threadnum);
    for (int j = 0 ; j < 100 ; ++j)
    {
            for (int i = 0 ; i < threadnum ; ++i)
                    tp.schedule(boost::bind(f));
            tp.wait() ;
    }
    std::cout << g << std::endl ;
    return 0 ;
 }

Its timing:

 real    0m0.344s
 user    0m0.250s
 sys     0m0.344s

I ran both versions many times, and the timing results stay similar.

Can atomics really help avoid the overhead of the sys calls made by a mutex/semaphore?

Any help will be appreciated.

Thanks

UPDATE: I increased the loop count to 1000000:

    for (int i = 0 ; i < 1000000 ; ++i)
    {
            pthread_mutex_lock(&mutex);
            ++g;
            pthread_mutex_unlock(&mutex);
    }

The boost::atomic version was changed the same way.

Both were timed with "time ./app".

With boost::atomic:

real    0m13.577s
user    1m47.606s
sys     0m0.041s

With pthread mutex:

real    0m17.478s
user    0m8.623s
sys     2m10.632s

It seems that boost::atomic is faster because the pthread mutex version spends far more time in sys calls.

Why is user time + sys time larger than real time?

Any comments are welcome !

I've tried it using Frederic's suggestion and the std::atomic(well, not boost :D) runs about 25% faster. Here's my code: http://ideone.com/4P1vO — mfontanini, Jun 10 '12 at 16:05
Well, it seems to be more realistic, but I would expect atomics to be way faster than 25%. Maybe 10 000 iterations is not enough to get rid of the thread pool overhead? — Frédéric Terrazzoni, Jun 10 '12 at 16:09
@mfontanini, thanks for your code. You must be using C++11, which is not available to me, so I used boost. Would you please test it with boost::atomic? Thanks! — user1000107, Jun 10 '12 at 16:28
I used `std::atomic` since I don't have `boost::atomic`, but you should get similar results anyway. — mfontanini, Jun 10 '12 at 17:20
Your algorithm measures nothing more than the time to increment some number. Effective use of atomics relates to the entire design of an algorithm and not just a simple change. That is, you aren't measuring any meaningful value in this test. — edA-qa mort-ora-y, Jun 10 '12 at 18:15

score 5 · Accepted Answer · answered Jun 10 '12 at 15:27

I guess you're not correctly measuring the time taken by atomics vs mutexes. Instead, you're measuring the overhead incurred by the boost thread pool management: it takes more time to set up a new task f() than to execute the task itself.

I suggest you add another loop in f() to obtain something like this (do the same for the atomic version):

 void f()
 {
    for(int i = 0  ; i < 10000 ; i++) {
      pthread_mutex_lock(&mutex);
      ++g;
      pthread_mutex_unlock(&mutex);
    }
    return ;
 }

Please post the timings if something changes, I'd be interested to see the difference!
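If you want to take the thread pool out of the picture entirely, here is a rough sketch of the same comparison. It assumes C++11 (std::thread, std::mutex, std::atomic) instead of pthread and boost, and the names run_threads, count_with_mutex, count_with_atomic and Result are made up for illustration. Each worker thread does all of its increments in one long loop, so thread startup is paid once per thread rather than once per increment.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original code): start `threads` workers,
// let each one call `work()` once, join them all, and return the elapsed
// wall-clock time in seconds.
template <typename Work>
double run_threads(int threads, Work work) {
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) pool.emplace_back(work);
    for (auto& th : pool) th.join();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Final counter value plus wall-clock seconds for one run.
struct Result {
    long count;
    double seconds;
};

// Every increment takes and releases the mutex, as in the pthread version.
Result count_with_mutex(int threads, int iters) {
    std::mutex m;
    long g = 0;
    double s = run_threads(threads, [&] {
        for (int i = 0; i < iters; ++i) {
            std::lock_guard<std::mutex> lock(m);
            ++g;
        }
    });
    return {g, s};
}

// Every increment is a single atomic read-modify-write, with no lock.
Result count_with_atomic(int threads, int iters) {
    std::atomic<long> g(0);
    double s = run_threads(threads, [&] {
        for (int i = 0; i < iters; ++i) ++g;
    });
    return {g.load(), s};
}
```

Call count_with_mutex(n, iters) and count_with_atomic(n, iters) from main with the same arguments and compare the seconds fields. Both counts should come out as n * iters; the timings depend on the machine and on how contended the counter is, so treat them as indicative only. With GCC or Clang on Linux you will likely need -std=c++11 -pthread.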
