
I have several (managed / .NET) processes communicating over a ring buffer that is held in shared memory via the MemoryMappedFile class (memory only, no file mapped). I know from the SafeBuffer reference source that writing a struct to that memory is guarded by a CER (Constrained Execution Region), but what if the writing process gets abnormally terminated by the OS while doing so? Can this lead to the struct being written only partially?


    struct MyStruct
    {
      public int A;
      public int B;
      public float C;
    }

    static void Main(string[] args)
    {
      var mappedFile = MemoryMappedFile.CreateOrOpen("MyName", 1024);
      var accessor = mappedFile.CreateViewAccessor(0, 1024);
      MyStruct myStruct;
      myStruct.A = 10;
      myStruct.B = 20;
      myStruct.C = 42f;
      // Assuming the process gets terminated during the following write operation.
      // Is that even possible? If it is possible what are the guarantees   
      // in regards to data consistency? Transactional? Partially written?
      accessor.Write(0, ref myStruct); 
      DoOtherStuff(); ...
    }

It is hard to simulate or test whether this problem really exists, since writing to memory is extremely fast. However, a partial write would certainly lead to a severe inconsistency in my shared memory layout and would make it necessary to guard against it with, for example, checksums or some sort of page flipping.

Update:

Looking at Line 1053 in

https://referencesource.microsoft.com/#mscorlib/system/io/unmanagedmemoryaccessor.cs,7632fe79d4a8ae4c

it basically comes down to the question whether a process is protected from abnormal termination while executing code in a CER block (having the Consistency.WillNotCorruptState flag set).

Thomas Zeman
  • No, it is an OS object and it knows that the RAM pages are mapped. You can lose data if the OS can't finish the job either, power loss for example. – Hans Passant Feb 17 '18 at 14:52
  • @Hans, I have added a code example demonstrating my question/problem. For me it is not about losing data it is about partially written data when the process which is writing the data gets terminated while writing. – Thomas Zeman Feb 17 '18 at 16:12
  • If you can't be sure that the Write() completed successfully before the program crashes/terminates then you of course don't know anything at all. The Write() is not atomic. I suspect you'll find it useful to dedicate a byte in the MMF that indicates "busy writing". Set it to 1 before the Write, to 0 after, now you have a fact. – Hans Passant Feb 17 '18 at 16:24
  • Why are other processes accessing a region (bigger than 64 bits) at *any time* and assuming consistency? It sounds like you have a broken protocol (or a non-existent one that needs to exist) – Damien_The_Unbeliever Feb 17 '18 at 16:39
  • An MMF is not necessarily useful for process interop. In fact one of its least practical usages by a long shot. – Hans Passant Feb 17 '18 at 17:05
  • @Hans yes, that is the direction I was heading for. Unfortunately this adds a lot of complexity that would not be necessary with the right Write guarantees and what these are is basically my question. – Thomas Zeman Feb 17 '18 at 17:10
  • @Damien There are certainly easy ways to implement a simple protocol with e.g. Mutexes, Events or Semaphores to achieve consistency among participating processes. However, it all gets very tricky if it can happen that one process writes only a part of whatever it wants to write. If you have any robust algorithm implementing a ring buffer (1 process consuming, n processes writing) under this condition please share. – Thomas Zeman Feb 17 '18 at 17:11
  • Hmm, you are trying to make "least practical usages" practical. It has been done already, message queue and service bus libraries abound. – Hans Passant Feb 17 '18 at 17:20
  • @Hans for example: https://github.com/spazzarama/SharedMemory/blob/master/SharedMemory/CircularBuffer.cs uses shared memory to do IPC. I think this particular solution would also have problems with a process terminating while writing to the buffer – Thomas Zeman Feb 17 '18 at 17:20
  • Certainly I am aware of message buses, queues or named pipes. This is not a question about my software design or what I am trying to achieve but simply a question about guarantees when working with shared memory. – Thomas Zeman Feb 17 '18 at 17:22
  • For IPC usage you never actually care about the file, only the data. And create the MMF backed by the paging file. When one process crashes/terminates while writing then the show is always over. It will own the sync object that gives write access, it is not going to be reset. Truly reliable multi-process implementations always require a "guard" process, one that can see that one of the participants fell over unexpectedly. A server is a common choice to be that reliable arbiter. – Hans Passant Feb 17 '18 at 17:32
  • I take this as a yes: the write (the copy, that is) can be interrupted by an abnormal process termination, even though Write ( https://referencesource.microsoft.com/#mscorlib/system/io/unmanagedmemoryaccessor.cs,7632fe79d4a8ae4c ) calls a CER-protected, "out of band exception" safe code block in line 1053. – Thomas Zeman Feb 17 '18 at 17:48

1 Answer


Yes, a process can be stopped at any moment.

The SafeBuffer.Write<T> method eventually calls into

    [MethodImpl(MethodImplOptions.InternalCall)]
    [ResourceExposure(ResourceScope.None)]
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    private static extern void StructureToPtrNative(/*ref T*/ TypedReference structure, byte* ptr, uint sizeofT);

which basically does a memcpy(ptr, structure, sizeofT). A memcpy of a multi-field struct is not a single atomic store, and unaligned writes are never atomic except for single bytes, so you will end up with a partially written value if your process is terminated in the middle of the write.
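One way to at least detect such a torn write, along the lines of the "busy writing" byte suggested in the comments, might look like the following minimal sketch. The GuardedSlot class, its offsets and the slot layout are hypothetical names introduced for illustration; they are not part of the MemoryMappedFile API. The sketch assumes a single writer per slot and does not replace a lock between concurrent readers and writers; it only lets a later reader see that the last write never finished.

```csharp
using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

struct MyStruct
{
    public int A;
    public int B;
    public float C;
}

static class GuardedSlot
{
    // Hypothetical layout: byte 0 is the "busy writing" flag,
    // the payload starts at offset 4 to keep the ints aligned.
    const long FlagOffset = 0;
    const long PayloadOffset = 4;

    public static void Write(MemoryMappedViewAccessor accessor, ref MyStruct value)
    {
        accessor.Write(FlagOffset, (byte)1);      // mark the slot as being written
        Thread.MemoryBarrier();                   // keep the flag store ahead of the payload stores
        accessor.Write(PayloadOffset, ref value); // the non-atomic copy
        Thread.MemoryBarrier();                   // keep the payload stores ahead of the flag reset
        accessor.Write(FlagOffset, (byte)0);      // mark the slot as complete
    }

    // Returns false if the last writer never cleared the flag,
    // i.e. the payload may be only partially written.
    public static bool TryRead(MemoryMappedViewAccessor accessor, out MyStruct value)
    {
        if (accessor.ReadByte(FlagOffset) != 0)
        {
            value = default(MyStruct);
            return false;
        }
        accessor.Read(PayloadOffset, out value);
        return true;
    }
}
```

If the writer is terminated between the two flag writes, the flag stays at 1 and TryRead reports the slot as unusable instead of handing out a half-copied struct. A checksum over the payload would serve the same purpose at a higher cost per read.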

When a process is terminated the hard way, via TerminateProcess or an unhandled exception, no CER or anything related is ever executed. There is no graceful managed shutdown in that case, and your application can be stopped right in the middle of an important transaction. Your shared memory data structures will be left in an orphaned state, and for any lock (mutex) the process still held, WaitForSingleObject will return WAIT_ABANDONED to the next waiter. That is how Windows tells you that a process died while holding the lock and that you need to recover the changes made by the last writer.
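In managed code, WAIT_ABANDONED surfaces as an AbandonedMutexException thrown by Mutex.WaitOne, and the waiting thread still ends up owning the mutex. A minimal sketch of reacting to it could look like this; SharedLock, Run and the recover callback are hypothetical names, and what "recover" actually has to do depends entirely on your buffer layout.

```csharp
using System;
using System.Threading;

static class SharedLock
{
    // Runs body while holding the mutex. If the previous owner went away
    // without releasing it, recover is invoked first so the shared state
    // can be validated or repaired before it is used again.
    public static void Run(Mutex mutex, Action recover, Action body)
    {
        bool abandoned = false;
        try
        {
            mutex.WaitOne();
        }
        catch (AbandonedMutexException)
        {
            // The mutex is owned by the calling thread at this point.
            abandoned = true;
        }

        try
        {
            if (abandoned)
            {
                recover();
            }
            body();
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }
}
```

For cross-process use the mutex would be a named one, for example new Mutex(false, "MyName.Lock"), where the name is again only an assumption made for this sketch.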

Alois Kraus