
We have some data structures that we are sharing across processes on Windows. (Via a shared data segment in a DLL that's loaded by all these processes.)

We need to synchronize some accesses, and we measured that the cost of using a Win32 Mutex is too high.

CRITICAL_SECTION cannot be put into shared memory due to some of its advanced features.

This leaves us needing a simple locking/mutex solution built directly on the Interlocked* family of functions in Win32.

Before rolling my own, I'd like to know whether there are robust implementations out there that are lightweight, fast, and work in shared memory across multiple processes. This turns out to be a tad hard for me to google. (And with the CodeProject hits, it's often hard to tell whether the code is a toy or "robust".)

So what I need could probably be called a user-mode recursive mutex that works across multiple processes when placed in shared memory on Windows. (Note that only the locking part needs to be handled safely; I can live with restrictions or additional requirements for initialization.)

Martin Ba
  • What's wrong with system provided mutexes? Critical sections are implemented on top of those, you know. – Seva Alekseyev Nov 23 '12 at 15:48
  • No, they're not. CRITICAL_SECTIONS are intra-process only. Mutexes are cross-process. It makes critical sections a lot cheaper to use, especially when there's low contention. – Puppy Nov 23 '12 at 15:51
  • @SevaAlekseyev Mutex involves stepping between user-mode and kernel-mode because that's where Mutexes (and most concurrency stuff) live. That's expensive. CRITICAL_SECTION is very lightweight and lives outside kernel land - but that's also why it can't live in shared memory (as noted in the article linked to by the OP). 'Critical Section Object/CRITICAL_SECTION' the implementation is not 'Critical Section' the concept. See http://msdn.microsoft.com/en-gb/library/windows/desktop/ms682530(v=vs.85).aspx – Rushyo Nov 23 '12 at 15:54
  • If it were possible to write an inter-process lock that is lightweight and robust, then probably Mutex would be written that way. But it sort of depends what you mean by "robust", Mutex has some features that you might not need. – Steve Jessop Nov 23 '12 at 15:59
  • @Steve - Well, I don't have generic inter-process requirements. I already have shared memory set up, and I don't need specific inter-process initialization, as the init part could be handled by an additional Win32 mutex if necessary. – Martin Ba Nov 23 '12 at 21:02
  • AFAIK, when a wait needs to be performed on a critical section, an unnamed mutex is silently created. A wait operation involves a user/kernel transition, by necessity. When there's no contention, it's user only. Source: Solomon/Russinovich. – Seva Alekseyev Nov 23 '12 at 21:04

1 Answer


Shared memory is a popular topic currently.

Try Boost.Interprocess (namespace boost::interprocess), which provides mechanisms you could use and shares common code across platforms.

http://www.boost.org/doc/libs/1_52_0/doc/html/interprocess/sharedmemorybetweenprocesses.html

Another reason to consider it is that the library also provides synchronisation and other IPC mechanisms that may be useful in the future.

http://www.boost.org/doc/libs/1_52_0/doc/html/interprocess/synchronization_mechanisms.html

For reference, its spin mutex is also built on atomic operations:

http://www.boost.org/doc/libs/1_52_0/boost/interprocess/sync/spin/mutex.hpp

inline void spin_mutex::lock(void)
{
   do{
      boost::uint32_t prev_s = ipcdetail::atomic_cas32(const_cast<boost::uint32_t*>(&m_s), 1, 0);

      if (m_s == 1 && prev_s == 0){
            break;
      }
      // relinquish current timeslice
      ipcdetail::thread_yield();
   }while (true);
}
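
The Boost snippet above is not recursive. As a rough illustration of how the same compare-and-swap idea might be extended to the recursive lock the question asks for, here is a minimal sketch. It is not Boost's code and not a vetted implementation: `RecursiveSpinLock`, its members, and the `self` owner-id parameter are all hypothetical names. It uses `std::atomic` for portability (on Windows this typically compiles down to the Interlocked* intrinsics) and assumes that a lock-free `std::atomic<std::uint32_t>` behaves correctly when mapped into several processes, that the memory is zero-initialized before first use, and that every caller passes a nonzero id that is unique across all participating processes (for example a Win32 thread id).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch of a recursive spin lock whose whole state is two
// integers, so that it could be placed in shared memory.
struct RecursiveSpinLock {
    std::atomic<std::uint32_t> owner{0};  // 0 means "unlocked"
    std::uint32_t depth = 0;              // only touched by the current owner

    // `self` is an assumed caller-supplied id: nonzero and unique across
    // every thread of every process that shares this lock.
    void lock(std::uint32_t self) {
        if (owner.load(std::memory_order_relaxed) == self) {
            ++depth;  // re-entrant acquire by the thread that already owns it
            return;
        }
        std::uint32_t expected = 0;
        while (!owner.compare_exchange_weak(expected, self,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
            expected = 0;               // a failed CAS overwrites `expected`
            std::this_thread::yield();  // relinquish the current time slice
        }
        depth = 1;
    }

    bool try_lock(std::uint32_t self) {
        if (owner.load(std::memory_order_relaxed) == self) {
            ++depth;
            return true;
        }
        std::uint32_t expected = 0;
        if (owner.compare_exchange_strong(expected, self,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed)) {
            depth = 1;
            return true;
        }
        return false;
    }

    void unlock(std::uint32_t self) {
        // Precondition: the caller currently holds the lock.
        assert(owner.load(std::memory_order_relaxed) == self && depth > 0);
        if (--depth == 0) {
            owner.store(0, std::memory_order_release);  // publish and release
        }
    }
};
```

One trade-off to keep in mind with any lock of this shape: if a process dies while holding it, nothing releases it, whereas a Win32 Mutex reports the abandoned state to the next waiter. Pure spinning with a yield can also behave poorly under heavy contention.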

Also, following the discussion in the comments below this post, look at the top answer to: Is there a difference between Boost's scoped mutex and WinAPi's critical section?

Caribou
  • I'm looking at this and wondering why it would be any more performant. It's a cross-platform wrapper and everything I'm reading there about Windows behaviour screams less efficient, not more (disk I/O? wrappers around kernel primitives?). POSIX-like support is nice but not what the OP is after. – Rushyo Nov 23 '12 at 16:01
  • @Rushyo it uses atomic ops and has recursive mutexes – Caribou Nov 23 '12 at 16:06
  • And this makes it more performant for the OP... how? A Windows kernel mutex is recursive by default (someone correct me if I'm wrong, source: http://www.ibm.com/developerworks/linux/library/l-ipc2lin3/index.html) but the issue is the context-switch. – Rushyo Nov 23 '12 at 16:11
  • "everything I'm reading there about Windows behaviour screams less efficient" Have you tried it? And anyway it's an option - boost is used professionally by lots of people. It is basically what he asked for - but even if it is less performant in this guise he could use it as a template for rolling his own – Caribou Nov 23 '12 at 16:15
  • Nothing about the original post makes me think he can't write his own without assistance (it's heavily implied he can). I just see no reason for this to be faster. It doesn't make any pretense towards speed, it's for POSIX compatibility. As for why I haven't tried it: I'm not the OP. I don't have his case to test against. CodeProject is used professionally by lots of people too (alas). It doesn't mean it's helpful to post snippets from there without a rationale since most of them aren't helpful to these specific cases. – Rushyo Nov 23 '12 at 16:18
  • @Rushyo Tell you what, I think I may have read your first comment a bit too quickly. I understand what you are saying now, and I was a bit too defensive. Yes, you are right in what you are saying; it's just that these comments are not the forum for discussion, and I was trying to do 2 things at once – Caribou Nov 23 '12 at 16:20
  • No problem. I was genuinely intrigued anyway; this topic is out of my comfort zone. Looking at: http://stackoverflow.com/questions/877577/is-there-a-difference-between-boosts-scoped-mutex-and-winapis-critical-section It seems like Boost calls into WaitForSingleObject on the events, which results in a kernel-context switch. Therefore I would expect it to be less performant. – Rushyo Nov 23 '12 at 16:26
  • @Rushyo I think that it is a CAS operation using the interlocked commands - basically a spin lock for Windows as well. That's why I posted it (unless you're saying that the Wait is being called under the hood in that call?) – Caribou Nov 23 '12 at 16:29
  • A recursive mutex is not a good idea, according to its creator. http://www.zaval.org/resources/library/butenhof1.html – DumbCoder Nov 23 '12 at 16:47
  • @DumbCoder maybe - but "user-mode recursive mutex that works for multiple processes when put in shared memory on Windows" from the OP – Caribou Nov 23 '12 at 16:55
  • Using something named "xxxdetail" seems like a bad start to me ;) – J.N. Nov 23 '12 at 16:56
  • Accepting answer for the link to: http://stackoverflow.com/questions/877577/is-there-a-difference-between-boosts-scoped-mutex-and-winapis-critical-section – Martin Ba Nov 19 '13 at 12:26