
Are there lightweight, cross-platform alternatives to Win32 CRITICAL_SECTION for C++? I am trying to make my Windows application platform agnostic, but std::recursive_mutex is way slower than CRITICAL_SECTION. I am currently using Visual Studio 2013 Community.

meriken2ch
  • Unfortunately, I cannot use `std::mutex` because it is not recursive. Besides, `std::mutex` is as slow as `std::recursive_mutex`. – meriken2ch May 01 '16 at 07:35
  • You should explore std::atomic. – zeromus May 01 '16 at 07:38
  • I tried `std::atomic`, but I got strange BEX/BEX64 exceptions. It would be great if you could show me some code. – meriken2ch May 01 '16 at 07:43
  • Some code for what? How do you measure slower/faster? – Ivan Aksamentov - Drop May 01 '16 at 07:59
  • C++ code that demonstrates how std::atomic can be used in lieu of CRITICAL_SECTION. As for measurement, the performance of my number-crunching application dropped by 50% after replacing CRITICAL_SECTION with std::recursive_mutex. – meriken2ch May 01 '16 at 08:33
  • Maybe you shouldn't use mutexes in performance-critical loops? Every mutex lock has the potential for a context switch. If you split your task into too many small pieces, performance degrades because threads compete for the mutex too often. It's hard to tell without knowing the code, and it also doesn't fully explain the difference from the `CRITICAL_SECTION` (which I'd expect `std::mutex` to wrap, actually). – Ulrich Eckhardt May 01 '16 at 09:10
  • The way this question is asked is off-topic for SO (it is a recommendation question). That being said, I'm sure Intel's [Threading Building Blocks](https://www.threadingbuildingblocks.org/docs/help/reference/synchronization/ppl_compatibility/critical_section.htm) has exactly what you're looking for. If not, look at [Boost](http://www.boost.org/doc/libs/release/doc/html/thread/synchronization.html). – rubenvb May 01 '16 at 09:17
  • std::recursive_mutex is built on top of the Concurrency Runtime in the Microsoft CRT. The layering is a bit heavy, so it is hard for it to compete with a dedicated OS primitive. Seeing a 50% perf drop is scary, however; you must be hammering that mutex very hard. Such fine-grained locking is almost always a problem. Nothing that an #ifdef couldn't work around, I suppose. – Hans Passant May 01 '16 at 09:21
  • Hans' comment makes sense. I am building Boost now. Hope this works better. – meriken2ch May 01 '16 at 09:41
  • Alternative? Redesign to do less inside the CS, thereby reducing the probability of contention. – Martin James May 01 '16 at 09:58
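
A minimal sketch of what the std::atomic and "do less inside the lock" suggestions above could look like, assuming the shared state is a single running total. The names `total`, `worker`, and `run_workers` are hypothetical and not taken from the question; real number-crunching code with more complex shared state may not reduce to this.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical shared total, standing in for whatever the critical section protected.
std::atomic<long long> total(0);

// Each worker accumulates privately and publishes once, so the shared
// variable is touched once per thread instead of once per iteration.
void worker(int iterations)
{
    long long local = 0;
    for (int i = 0; i < iterations; ++i)
        local += i;
    total.fetch_add(local, std::memory_order_relaxed);
}

// Runs thread_count workers and returns the combined total.
long long run_workers(int thread_count, int iterations)
{
    assert(thread_count > 0 && iterations >= 0);
    total.store(0);
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t)
        threads.emplace_back(worker, iterations);
    for (auto& th : threads)
        th.join();  // join() makes every worker's fetch_add visible here
    return total.load();
}
```

Calling `run_workers(4, 1000)` from `main` would sum 0..999 on four threads. The relaxed ordering is enough here only because the result is read after `join()`; it is not a general replacement for a lock.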

2 Answers


You should have a look at the Boost.Thread library and boost::recursive_mutex in particular.

(also see How do I make a critical section with Boost?)

Florian
  • Feels like using a sledgehammer to crack a nut, but why not... We will see. – meriken2ch May 01 '16 at 09:48
  • Depending on your actual needs, more lightweight alternatives are of course available. If you just need to protect some shared resources, [boost::atomic](http://www.boost.org/doc/libs/1_59_0/doc/html/atomic/usage_examples.html) (or std::atomic if you can use c++11 features) might be an option too. – Florian May 01 '16 at 11:28
  • The [boost::detail::lightweight_mutex](http://www.boost.org/doc/libs/1_60_0/boost/detail/lightweight_mutex.hpp) is probably what you are looking for: it implements a subset of the Mutex concept requirements and maps to a CRITICAL_SECTION on Windows or a pthread_mutex on POSIX systems. – Florian May 01 '16 at 12:16
  • The boost::detail::lightweight_mutex would have been perfect if it were recursive on POSIX systems. – meriken2ch May 01 '16 at 20:01
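
One way to get the same mapping with recursion on both platforms is a thin wrapper selected by `#ifdef`, as hinted at in the comments on the question. The following is only a sketch under that assumption: `RecursiveLock` and `nested_depth` are hypothetical names, not part of Boost or the standard library, and error handling is reduced to assertions.

```cpp
#include <cassert>
#include <mutex>

#ifdef _WIN32
#include <windows.h>
#else
#include <pthread.h>
#endif

// Hypothetical wrapper: CRITICAL_SECTION on Windows, recursive pthread mutex elsewhere.
class RecursiveLock
{
public:
    RecursiveLock()
    {
#ifdef _WIN32
        InitializeCriticalSection(&cs_);
#else
        pthread_mutexattr_t attr;
        int rc = pthread_mutexattr_init(&attr);
        assert(rc == 0);
        rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        assert(rc == 0);
        rc = pthread_mutex_init(&mutex_, &attr);
        assert(rc == 0);
        pthread_mutexattr_destroy(&attr);
        (void)rc;  // silence unused-variable warnings when NDEBUG is defined
#endif
    }

    ~RecursiveLock()
    {
#ifdef _WIN32
        DeleteCriticalSection(&cs_);
#else
        pthread_mutex_destroy(&mutex_);
#endif
    }

    RecursiveLock(const RecursiveLock&) = delete;
    RecursiveLock& operator=(const RecursiveLock&) = delete;

    void lock()
    {
#ifdef _WIN32
        EnterCriticalSection(&cs_);
#else
        pthread_mutex_lock(&mutex_);
#endif
    }

    void unlock()
    {
#ifdef _WIN32
        LeaveCriticalSection(&cs_);
#else
        pthread_mutex_unlock(&mutex_);
#endif
    }

    bool try_lock()
    {
#ifdef _WIN32
        return TryEnterCriticalSection(&cs_) != 0;
#else
        return pthread_mutex_trylock(&mutex_) == 0;
#endif
    }

private:
#ifdef _WIN32
    CRITICAL_SECTION cs_;
#else
    pthread_mutex_t mutex_;
#endif
};

// Re-enters the lock once per recursion level to show that it is recursive.
int nested_depth(RecursiveLock& lock, int depth)
{
    std::lock_guard<RecursiveLock> guard(lock);
    return depth == 0 ? 0 : 1 + nested_depth(lock, depth - 1);
}
```

Because the class provides `lock()`, `unlock()`, and `try_lock()`, it should work with `std::lock_guard` and `std::unique_lock`, so call sites need no platform-specific code.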

http://en.cppreference.com/w/cpp/atomic/atomic_flag "A spinlock mutex can be implemented in userspace using an atomic_flag"
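
For reference, a plain (non-recursive) spinlock along those lines might look roughly like the sketch below. This is not a copy of the cppreference example: the names `spin`, `shared_sum`, `add_many`, and `run_spinlock_demo` are hypothetical, and the `yield()` in the wait loop is an added assumption to be kinder to other threads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic_flag spin = ATOMIC_FLAG_INIT;  // hypothetical name for the spinlock flag
long long shared_sum = 0;                  // only touched while the flag is held

void add_many(int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        // Acquire: spin until the flag was previously clear.
        while (spin.test_and_set(std::memory_order_acquire))
            std::this_thread::yield();
        ++shared_sum;
        // Release: let the next waiter in.
        spin.clear(std::memory_order_release);
    }
}

// Runs thread_count threads that each increment shared_sum `iterations` times.
long long run_spinlock_demo(int thread_count, int iterations)
{
    assert(thread_count > 0 && iterations >= 0);
    shared_sum = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t)
        threads.emplace_back(add_many, iterations);
    for (auto& th : threads)
        th.join();
    return shared_sum;
}
```

Locking this a second time from the same thread would deadlock, which is the limitation the recursive variant has to address.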

I adapted their userspace spinlock mutex to allow recursive locks.

Warning: hastily written synchronization logic should be assumed faulty until it has been thoroughly battle-tested, and the same goes for carefully written synchronization logic.

#include <thread>
#include <vector>
#include <iostream>
#include <atomic>

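// Spinlock guarding current_thread and counter.
std::atomic_flag lock = ATOMIC_FLAG_INIT;
// Thread that holds the recursive lock; a default-constructed id means "unowned".
std::thread::id current_thread;
// Recursion depth; only accessed while the spinlock is held, so volatile is not needed.
int counter = 0;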

void lockme()
{
    for(;;)
    {
        //protect access to current_thread and counter
        while (lock.test_and_set(std::memory_order_acquire))
        {}

        //use current_thread and counter

        //the thread with the conceptual lock is in current_thread
        //if that's this thread, or no such thread, make sure this thread has the lock and increment counter
        auto myid = std::this_thread::get_id();
        if(current_thread == myid || current_thread == std::thread::id())
        {
            counter++;
            current_thread = myid;
            lock.clear(std::memory_order_release);
            return;
        }

        lock.clear(std::memory_order_release);
    }
}

void unlockme()
{
    for(;;)
    {
        //protect access to current_thread and counter
        while (lock.test_and_set(std::memory_order_acquire))
        {}

        //use current_thread and counter

        //if this thread has the conceptual lock, perform the unlock
        //otherwise try again
        auto myid = std::this_thread::get_id();
        if(current_thread == myid)
        {
            counter--;
            if(counter==0)
                current_thread = std::thread::id();
            lock.clear(std::memory_order_release);
            return;
        }

        lock.clear(std::memory_order_release);
    }

}

void f(int n)
{
    for (int cnt = 0; cnt < 100; ++cnt) {
        for (int j = 0; j < 10; j++) lockme();
        std::cout << "Output from thread " << n << '\n';
        for (int j = 0; j < 10; j++) unlockme();
    }
}

int main()
{
    std::vector<std::thread> v;
    for (int n = 0; n < 10; ++n) {
        v.emplace_back(f, n);
    }
    for (auto& t : v) {
        t.join();
    }
}
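
The same idea can be packaged as a class so that several independent locks can coexist and `std::lock_guard` handles the unlocking. This is a sketch under assumptions, not a tested drop-in: `RecursiveSpinLock` and `run_recursive_demo` are hypothetical names, waiting threads call `yield()`, and an unlock by a non-owner is treated as a bug (assertion) rather than retried in a loop as in the code above.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class wrapping the recursive spinlock idea shown above.
class RecursiveSpinLock
{
public:
    RecursiveSpinLock() : depth_(0) { guard_.clear(); }

    void lock()
    {
        const std::thread::id me = std::this_thread::get_id();
        for (;;) {
            acquire_guard();
            // Free (depth 0) or already ours: take it or re-enter.
            if (depth_ == 0 || owner_ == me) {
                owner_ = me;
                ++depth_;
                release_guard();
                return;
            }
            release_guard();
            std::this_thread::yield();  // someone else owns it; back off and retry
        }
    }

    void unlock()
    {
        acquire_guard();
        assert(depth_ > 0 && owner_ == std::this_thread::get_id());
        --depth_;
        release_guard();
    }

private:
    void acquire_guard()
    {
        while (guard_.test_and_set(std::memory_order_acquire)) {}
    }
    void release_guard() { guard_.clear(std::memory_order_release); }

    std::atomic_flag guard_;  // protects owner_ and depth_
    std::thread::id owner_;   // meaningful only while depth_ > 0
    int depth_;               // recursion count
};

// Each thread takes the lock twice (nested) per iteration and bumps a counter.
int run_recursive_demo(int thread_count, int iterations)
{
    RecursiveSpinLock lock;
    int shared = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&lock, &shared, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<RecursiveSpinLock> outer(lock);
                std::lock_guard<RecursiveSpinLock> inner(lock);  // re-entrant
                ++shared;
            }
        });
    }
    for (auto& th : threads)
        th.join();
    return shared;
}
```

As with any spinlock, this only makes sense when the protected region is very short; under heavy contention an OS-backed lock will usually behave better.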
zeromus