Why does sleep_for(nanoseconds(1)) improve performance so much in a busy-wait loop?

Question

While implementing a simple semaphore, I tried measuring the average time spent in the busy-wait loop and then sleeping for that many nanoseconds. Sleeping for just 1 nanosecond turned out to achieve the same result either way: there is a huge speed-up (about 6x faster) when I call...

std::this_thread::sleep_for(std::chrono::nanoseconds(1));

...in the busy-wait loop of the semaphore's P() function.

I believe that the sleep is greatly reducing contention, but I'd like someone more knowledgeable to give me a more certain answer.

Using std::this_thread::yield() instead also gives a better execution time than plain spinning, but CPU usage stays high (90-100%). With the sleep, CPU usage stays at 10-15%. Threads still interleave with the sleep, so the speed-up is not caused by one thread executing all of its instructions in a row.

Code

I've minimized this example as much as possible to highlight the relevant part of the code. I've provided demo links with the full implementation at the end of the question.

#include <atomic>
#include <chrono>  // std::chrono::nanoseconds
#include <thread>  // std::this_thread::sleep_for
struct semaphore
{
public:
    explicit semaphore( int const max_count );

    void P()
    {
        // why does this sleep improve performance so much?
        while ( !try_decrease_count() )
            std::this_thread::sleep_for( std::chrono::nanoseconds( 1 ) );
    }

    void V()
    {
        count_.fetch_add( 1 );
    }

private:
    bool try_decrease_count();

    std::atomic<int> count_;
};

semaphore::semaphore( int const max_count )
    : count_{ max_count }
{}

bool semaphore::try_decrease_count()
{
    int old_count{ count_.load() };
    do
    {
        if ( !old_count ) return false;
    } while ( !count_.compare_exchange_strong( old_count, old_count - 1 ) );
    return true;
}
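
For comparison, the three waiting strategies discussed in the question (plain spinning, yield, and sleep_for) can be put behind one template parameter and timed with the same workload. The following is only a sketch: the names basic_semaphore, spin_backoff, yield_backoff, sleep_backoff, and run_once are made up for illustration and are not part of the linked demos, and the workload (each thread doing P() immediately followed by V()) is an assumption, not the test used in the demos.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical back-off policies (illustrative names, not from the demos).
struct spin_backoff  { static void wait() {} };  // pure busy-wait
struct yield_backoff { static void wait() { std::this_thread::yield(); } };
struct sleep_backoff
{
    static void wait() { std::this_thread::sleep_for( std::chrono::nanoseconds( 1 ) ); }
};

// Same counting logic as the semaphore above; only the wait step is swappable.
template <typename Backoff>
class basic_semaphore
{
public:
    explicit basic_semaphore( int const max_count ) : count_{ max_count } {}

    void P()
    {
        while ( !try_decrease_count() )
            Backoff::wait();
    }

    void V() { count_.fetch_add( 1 ); }

    int count() const { return count_.load(); }

private:
    bool try_decrease_count()
    {
        int old_count{ count_.load() };
        do
        {
            if ( !old_count ) return false;
        } while ( !count_.compare_exchange_strong( old_count, old_count - 1 ) );
        return true;
    }

    std::atomic<int> count_;
};

// Starts `threads` workers that each perform `iterations` P()/V() pairs
// and returns the elapsed wall-clock time in milliseconds.
template <typename Backoff>
long long run_once( int const threads, int const iterations, int const permits )
{
    basic_semaphore<Backoff> sem{ permits };
    auto const start = std::chrono::steady_clock::now();

    std::vector<std::thread> workers;
    for ( int t = 0; t < threads; ++t )
        workers.emplace_back( [&sem, iterations] {
            for ( int i = 0; i < iterations; ++i )
            {
                sem.P();
                sem.V();
            }
        } );
    for ( auto& w : workers ) w.join();

    // Every P() was matched by a V(), so all permits must be back.
    assert( sem.count() == permits );

    auto const elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::milliseconds>( elapsed ).count();
}
```

Calling, say, run_once<spin_backoff>( 4, 100000, 1 ), run_once<yield_backoff>( 4, 100000, 1 ), and run_once<sleep_backoff>( 4, 100000, 1 ) from main and printing the results gives one timing per strategy. The thread and iteration counts are arbitrary, and the numbers will depend heavily on the scheduler and core count.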

Demonstrations

These include the full code along with tests.

Demo with sleep_for: http://coliru.stacked-crooked.com/a/ff16411d9a884556

Demo without sleep_for: http://coliru.stacked-crooked.com/a/2df776f0a03425b1

Do you get the same benefit from calling [`yield`](http://en.cppreference.com/w/cpp/thread/yield)? — Tony Delroy, Oct 30 '15 at 05:21
[Have a look here](http://stackoverflow.com/questions/17325888/c11-thread-waiting-behaviour-stdthis-threadyield-vs-stdthis-thread). Explains the behavior of the sleep. — Totonga, Oct 30 '15 at 05:30
@TonyD yield does not provide as much benefit as `sleep_for(...)`, but is still faster than not sleeping/yielding at all. — user2296177, Oct 30 '15 at 05:39
@GregorMcGregor Thanks for the suggestion, I'll try that out. As I told TonyD, yield does not provide as much benefit as sleep_for, so I wonder what's going on. — user2296177, Oct 30 '15 at 05:39
@GregorMcGregor Fixed the microsecond confusion; I had just been testing it out with different units to see if there were differences. — user2296177, Oct 30 '15 at 05:58
