
For a lightweight signal between threads on Windows, WaitOnAddress is the ideal tool, but it is only available starting with Windows 8. I have to support Windows 7 for the time being, and the best improvement over a Win32 Event object that I found (both by searching around and by reading MSDN) is SleepConditionVariableCS.

Here is a minimal example (handle Ctrl-C in a console app):

#include <windows.h>

//HANDLE Signal;  // event
CONDITION_VARIABLE Signal;
CRITICAL_SECTION cs;

BOOL WINAPI CtrlHandler(DWORD ctl){
  switch (ctl) {
    case CTRL_C_EVENT:  /*SetEvent(Signal);*/ WakeConditionVariable(&Signal); return TRUE;
    default:            return FALSE; // pass it to the system
  }
}

int main(){

  //Signal = CreateEventA(NULL, TRUE /*manual-reset event*/, 
  //                       FALSE/*initial nonsignaled*/, "ctrl-c_sig023xgyI8");
  InitializeConditionVariable(&Signal);  InitializeCriticalSection(&cs);
  
  SetConsoleCtrlHandler(CtrlHandler, TRUE); // register CtrlHandler with the system
  // auto hThread = CreateThread(nullptr, 0, worker, nullptr, ..);
  
  //WaitForSingleObject(Signal, INFINITE);
  EnterCriticalSection(&cs);
    SleepConditionVariableCS (&Signal, &cs, INFINITE);
  LeaveCriticalSection(&cs);
  // stop thread, cleanup, CloseHandle(hThread); CloseHandle(Signal);

  return 0;
}

This example forces the "signal" use case: the system starts a dedicated thread just to run CtrlHandler when the user presses Ctrl-C, and that thread then gets a chance to signal the main thread that the user requested a stop.

Is there a better (lighter) approach in the Win32 API or in C++ (knowing that, of course, C++ ultimately has to go through the Win32 API)?

Remy Lebeau
  • There is always `https://en.cppreference.com/w/cpp/thread/condition_variable`, which probably does much the same as the code you have there. Why is 'lightweight' such a pressing problem? I doubt this approach uses a great deal of system resources. – Paul Sanders Aug 07 '23 at 22:15
  • In this minimal example, you could invent your own undebugged synchronization mechanism and probably wouldn't see a performance difference. "Lightweight" matters in real critical code (unfit for SO small examples). But I'll admit there is an academic aspect to it also. – Wasfi JAOUAD Aug 07 '23 at 22:28
  • I would use the features for this that you'll find in the C++ standard library. Perform profiling before trying to use these platform specific functions to make it more lightweight. Most of the time, the thing in the standard library is just a wrapper around a set of platform specific functions, packaged in a neat way to make them easy to use (and supports RAII all the way). – Ted Lyngmo Aug 07 '23 at 22:36
  • @Ted : you had to say it ! To be honest, I'm expecting no answer/"that's the lightest there is", and so I'll end up profiling anyway (but hey, you never know !). But I was more thinking about comparing Event/SleepConditionVariableCS/SleepConditionVariableSRW. You think C++ can do better ? It has to go through WinApi, and it can't possibly do less than a single call to the API like I do (OK, maybe two with the Wake function). – Wasfi JAOUAD Aug 07 '23 at 23:05
  • @WasfiJAOUAD I'm expecting those writing the standard library for a certain platform to do a decent job of not bloating object wrappers or slowing down the interaction with the underlying API. I do not expect things to be faster when using the C++ library but easier (a lot less code) and safer (RAII) - and I also expect these wrappers to be well tested, which is hard when it comes to threading. – Ted Lyngmo Aug 07 '23 at 23:08
  • @Ted : I totally agree (basically what Paul said in a more practical way in his "reasonable" comment). But we're not talking about best possible raw performance anymore (which is rarely needed, but that's what I'm asking here). – Wasfi JAOUAD Aug 08 '23 at 21:00
  • @WasfiJAOUAD Then, what except raw performance would be a good reason? – Ted Lyngmo Aug 08 '23 at 21:08
  • @Ted : what you mention : outside of a situation where best possible perf. is priority #1 (1% of code ? not even), I'm more than happy to use a "reasonably fast" API/Lib made by mature developers who care about bloat/RAII/thread safety/readability/.. I do care about those too. Here I'm asking about the 1%. Hope I'm clear. – Wasfi JAOUAD Aug 08 '23 at 21:29
  • A "reasonably fast" API is not always the really fast API before benchmarking. Also, it's not an optimization option to choose a really fast API. – YangXiaoPo-MSFT Aug 09 '23 at 08:13

2 Answers


It doesn't look like there is anything faster than SleepConditionVariableCS/SRW.

And there is not much difference between the two. This benchmark shows a slight advantage for SRW (~2%) on my CPU:

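#include <windows.h>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <thread>
//#include <mutex>
//#include <condition_variable>

int main(int argc, char **argv){

  CONDITION_VARIABLE cv; InitializeConditionVariable(&cv);
  
  //SRWLOCK Signal; InitializeSRWLock(&Signal);
  CRITICAL_SECTION cs; InitializeCriticalSection(&cs);
  //std::mutex mtx;
  //std::condition_variable ccv;

  std::atomic<bool> stop_thread{false}; // atomic: written by main, read by th1
  auto th1 = std::thread([&](){ while(true){ WakeConditionVariable(&cv); if(stop_thread) return; }});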
  //auto th1 = std::thread([&](){ while(true){ ccv.notify_one(); if(stop_thread) return; }});

  const int64_t n = 1000 * 1000 * 1000;
  auto start = std::chrono::high_resolution_clock::now();
  for (int64_t i = 0; i < n; i++) {
    //AcquireSRWLockExclusive(&Signal);
    //  SleepConditionVariableSRW(&cv, &Signal, INFINITE, 0);
    //ReleaseSRWLockExclusive(&Signal);
    
    EnterCriticalSection(&cs);
      SleepConditionVariableCS (&cv, &cs, INFINITE);
    LeaveCriticalSection(&cs);
    
    //std::unique_lock lk(mtx);
    //ccv.wait(lk);
  }
  auto elapsed = std::chrono::high_resolution_clock::now() - start;
  stop_thread = true; th1.join();
  double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
  printf("%.3lf calls/sec\n", n / nSec);
  
  return 0;
}

std::condition_variable is a good 20% slower.

Bottom line: if you don't need recursive locking and contention is low (no Try.. shenanigans for SRW), go with SRW (with its NtCreateKeyedEvent special sauce); otherwise use CS.

---- EDIT ----

Another nice tool to consider (it's actually needed in my sample program!) is C++20's std::latch, which can be "abused" into a "cooperative WaitOnAddress()":

std::latch mainCanBeAwakened{1};

BOOL WINAPI CtrlHandler(DWORD ctl){
  ...
    case CTRL_C_EVENT:  mainCanBeAwakened.wait(); // user typing faster than light ?
       WakeConditionVariable(&Signal); return TRUE;
  ...
}

int main(){

  ...  
  EnterCriticalSection(&cs);
    mainCanBeAwakened.count_down(); // call twice (> mainCanBeAwakened.n) for UB !
    SleepConditionVariableCS (&Signal, &cs, INFINITE);
  LeaveCriticalSection(&cs);
  ...
}

main() has to initialize, start workers, and so on, and only then wait for Ctrl-C. If that takes time, a Ctrl-C can arrive before main() sleeps on the condition variable, and the WakeConditionVariable() call would then have no effect on the main() thread. So CtrlHandler() needs to know that the CV has been "altered" (slept on by main()); put differently, it needs to "WaitOnAddress(CV)".

As you can see above, this can be done very economically (in resources and code) with a std::latch (which is even lighter than std::barrier) when WaitOnAddress and SynchronizationBarrier (both available only on Windows 8+) are off the table. With the latter you would call InitializeSynchronizationBarrier with a thread count of 2, EnterSynchronizationBarrier in both threads, then DeleteSynchronizationBarrier.
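
The lost-wakeup window described above is commonly closed by pairing the condition variable with a flag that is checked under the same lock, so that an early wake is remembered and a spurious wake is ignored. Here is a minimal sketch of that idea in portable C++, with std::mutex and std::condition_variable standing in for CRITICAL_SECTION and CONDITION_VARIABLE. The names StopSignal, request_stop, wait_for_stop and early_signal_is_not_lost are made up for illustration and are not from the code above.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a one-shot "stop requested" signal.
class StopSignal {
public:
  // Called by the signalling thread (the role CtrlHandler plays above).
  void request_stop() {
    {
      std::lock_guard<std::mutex> lk(m_);
      stop_ = true;            // remembered even if nobody is waiting yet
    }
    cv_.notify_one();
  }

  // Called by the waiting thread (the role main() plays above).
  void wait_for_stop() {
    std::unique_lock<std::mutex> lk(m_);
    // Returns at once if stop_ is already set; re-checks after every wake.
    cv_.wait(lk, [this] { return stop_; });
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  bool stop_ = false;
};

// Demo: the signal is raised BEFORE the wait starts, which is the case
// a bare wake/sleep pair without a flag would miss.
inline bool early_signal_is_not_lost() {
  StopSignal sig;
  std::thread handler([&] { sig.request_stop(); });
  handler.join();        // the wake has definitely happened by now
  sig.wait_for_stop();   // still returns, because the flag was recorded
  return true;
}
```

Calling early_signal_is_not_lost() from main() returns instead of hanging. The same flag-under-lock pattern should carry over to SleepConditionVariableCS by looping on a bool while holding the critical section.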

---- EDIT ----

As per Alex's answer, you can further drop CV and CS ! (on Win7 in particular)

I don't think it can possibly get lighter than a naked exclusive Slim RW lock.

It is wiser, though, to let down the latch if you have a dozen locks juggled in a lively thread pool.

  • In MSVC, `std::latch` / `std::barrier` / `std::semaphore` will use `std::atomic::wait`, which will turn into `WaitOnAddress` on Win8+ and `SRWLOCK` + `CONDITION_VARIABLE` on Win7 – Alex Guteniev Aug 18 '23 at 07:40
  • @Alex: you're exactly right. I just traced back `std::latch` (`include/atomic`) to the beginning of `xatomic_wait.h` in VS19 (`__std_atomic_`* class & functions, starting line 22). It clearly says : `WaitOnAddress` or fallback to `SRWLOCK` and `CONDITION_VARIABLE` (line 41). It's a no-brainer then : `std::latch/barrier` all the way (much nicer syntax, and safer with destructors & all), unless profiling says otherwise. – Wasfi JAOUAD Aug 19 '23 at 09:56
  • I've participated in the implementation of MSVC [`atomic::wait`](https://github.com/microsoft/STL/pull/593) and [`latch`, etc](https://github.com/microsoft/STL/pull/1057) so I know :-) Indeed, C++ facilities provide higher abstraction level, and are also portable. – Alex Guteniev Aug 19 '23 at 10:55
  • Oh ! So you're the "mature developers who care about bloat/RAII/thread safety/.." and you also came here to cover the "1% of code or less" case I mention to Ted in the comments above !! Thank you for proving me right when I commented "I expect no answer, but hey, you never know !" :)) – Wasfi JAOUAD Aug 19 '23 at 11:04

If you're looking for a compliant way, you can have an Event or Semaphore object backed by an atomic, possibly with spinning. You get the part of WaitOnAddress's lightness that comes from doing the common operations in user space. You still won't save an OS object, though.
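
One way to read the "event backed by an atomic, with possible spinning" idea is sketched below in portable C++. It is an illustration under assumptions, not the answer's own code: a std::mutex plus std::condition_variable stands in for the OS Event or Semaphore, the spin count of 4000 is arbitrary, and the names SpinThenBlockEvent and demo_spin_event are hypothetical.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot event. The atomic flag serves the fast path in user
// space; the mutex + condition_variable stand in for the OS object.
class SpinThenBlockEvent {
public:
  void set() {
    flag_.store(true, std::memory_order_release);
    // Taking the lock orders this store against a waiter about to block.
    std::lock_guard<std::mutex> lk(m_);
    cv_.notify_all();
  }

  void wait(int spin_count = 4000) {   // spin_count is an arbitrary guess
    for (int i = 0; i < spin_count; ++i) {
      if (flag_.load(std::memory_order_acquire)) return;  // no lock taken
      std::this_thread::yield();
    }
    // Slow path: block until the flag is observed set.
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [this] { return flag_.load(std::memory_order_acquire); });
  }

  bool is_set() const { return flag_.load(std::memory_order_acquire); }

private:
  std::atomic<bool> flag_{false};
  std::mutex m_;
  std::condition_variable cv_;
};

// Demo: one thread sets the event, the caller waits for it.
bool demo_spin_event() {
  SpinThenBlockEvent ev;
  std::thread setter([&] { ev.set(); });
  ev.wait();
  setter.join();
  return ev.is_set();
}
```

When the event is already set, or gets set during the spin, wait() never touches the blocking object, which is where the saving comes from.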


If you're fine with a hacky way that goes against the documentation and is not future-proof, then SRWLOCK is known to behave as a binary semaphore, meaning that it can be released from a thread other than the one that acquired it. Application Verifier has a check for that, but this currently works with the check disabled.

See "Can SRW Lock be used as a binary semaphore?"

So,

SRWLOCK lock = SRWLOCK_INIT;

BOOL WINAPI CtrlHandler(DWORD ctl){
  switch (ctl) {
    case CTRL_C_EVENT:  ReleaseSRWLockExclusive(&lock); return TRUE;
    default:            return FALSE; // pass it to the system
  }
}

int main(){

  //Signal = CreateEventA(NULL, TRUE /*manual-reset event*/, 
  //                       FALSE/*initial nonsignaled*/, "ctrl-c_sig023xgyI8");

  AcquireSRWLockExclusive(&lock);

 
  SetConsoleCtrlHandler(CtrlHandler, TRUE); // register CtrlHandler with the system
  // auto hThread = CreateThread(nullptr, 0, worker, nullptr, ..);
  
  AcquireSRWLockExclusive(&lock); // again, will be acquired when released in `CtrlHandler`

  return 0;
}
Alex Guteniev
  • Live and learn. I was saying C++ has better syntax, but this actually puts WinAPI back on the map! I still would go with C++ (nicer & safer). But I wouldn't hesitate to use this "hack" for a small gadget/if other developers won't interact with my code. It's not a hack when you know for a fact that the OS won't move anymore (my question is about Win7 specifically) ! – Wasfi JAOUAD Aug 19 '23 at 10:13