
I have a program that is a C# wrapper for a program originally written in C/C++. The code goes through a SWIG process to make the DLLs accessible. Recently this code has started to crash intermittently with an access violation, but only when run in parallel with other similar code, which suggests a shared resource. However, there is no clear shared resource.

Also, the program always seems to complete (with proper output) and the access violation occurs on exit (when stepping through the code with VS19 the intermittent access violation occurs after the final return statement). I have tried putting handlers for AppDomain.ProcessExit and for UnhandledException, but it never hits that code. Even when VS19 reports an access violation in the output window, it always reports that the code has exited with 0, so I can't even be sure if this is the access violation being reported to the OS.
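For reference, a minimal sketch of the kind of handler registration described above. The class and method names (`ExitDiagnostics`, `Register`) are hypothetical. Both events are raised by the managed runtime, so a native fault that happens later in process teardown would plausibly never reach them, which matches what is observed here; that reading is an assumption, not something confirmed for this crash.

```csharp
using System;

internal static class ExitDiagnostics
{
    // Hypothetical helper: call it as the first statement in Main().
    public static void Register()
    {
        // Raised by the CLR when the default AppDomain begins shutting down.
        AppDomain.CurrentDomain.ProcessExit += (sender, e) =>
            Console.Error.WriteLine("ProcessExit raised");

        // Raised for unhandled exceptions that the CLR itself observes.
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Console.Error.WriteLine("UnhandledException: " + e.ExceptionObject);
    }
}
```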

Running from the command line, I was able to get it to crash intermittently with exit code -1073741819 (0xC0000005).
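The decimal exit code and the hexadecimal status are the same 32-bit value, read as signed and as unsigned respectively. A small sketch of the conversion, using a hypothetical helper name (`ExitCodes.ToNtStatusHex`):

```csharp
using System;

internal static class ExitCodes
{
    // Reinterprets a signed 32-bit process exit code as an unsigned value
    // and formats it as eight hexadecimal digits.
    public static string ToNtStatusHex(int exitCode)
    {
        return "0x" + unchecked((uint)exitCode).ToString("X8");
    }
}
```

With this helper, `ExitCodes.ToNtStatusHex(-1073741819)` returns `"0xC0000005"`, the access violation status code.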

Running procdump.exe I was able to get a dump of the crash, with this error: Unhandled exception at 0x00007FFB0FDDFCA0 (iertutil.dll) in XXXXXXXXXXXXXXX.exe_220719_142518.dmp: 0xC0000005: Access violation reading location 0x0000000000000000.

The call stacks are not very revealing. While analyzing the dump in Visual Studio, it says that symbols are loaded for all of my DLLs, but when I look at the call stack of the access violation, it only shows me a very limited stack of the Windows calls (see link).

(Linked image: call stacks from the dump after the access violation)

The actual access violation is in _purecall, but again this crash is occurring after the return statement in my C# main. I cannot even figure out why wininet.dll!InternetGetConnectedState() would be called at that point in the code.

I suspect that there is something in one of the C or C++ libraries that is putting in an atexit call for something that the C# has already cleared. I have tried forcing garbage collection to occur earlier in the C# code, but this does not cause an access violation.
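A minimal sketch of forcing collection and finalization before `Main()` returns, as described above. The helper name (`NativeCleanup.ForceFinalization`) is hypothetical. It assumes the SWIG-generated C# proxies release their native objects from finalizers, which is how SWIG proxies are normally generated but should be checked against the actual wrapper code.

```csharp
using System;

internal static class NativeCleanup
{
    // Hypothetical helper: call it just before the final return in Main().
    public static void ForceFinalization()
    {
        GC.Collect();                  // collect unreachable wrapper objects
        GC.WaitForPendingFinalizers(); // let their finalizers run to completion
        GC.Collect();                  // reclaim objects released by those finalizers
    }
}
```

If a finalizer were touching native state that had already been torn down, running it this early would be expected to move the failure to this point rather than to process exit.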

So the questions are

  1. What could be causing this access violation on program exit, and how can I debug it?
  2. Why are only the Windows calls seen in the call stack from procdump when all of my symbols are reported to be loaded?
drfred

1 Answer


I was experiencing exactly the same issue, and for a use case similar to the one mentioned in the original question.

Before providing a tentative answer and solution to the problem, allow me to provide more insights into the problem:

The common patterns between my use case and the originally posted one are:

  1. SWIG (C++ to C#).
  2. Concurrent code - multi-threaded and/or multi-process.
    • In my case, though, all threads and child processes were clearly joined properly.
    • All shutdown / cleanup code in the worker threads (allocated from a thread pool) and in the parent process (which coordinates the child processes) ran correctly, and before the C# Main() returned.
    • Therefore, it did not seem to be a race condition in my case,
    • nor a case of threads (or child processes) left un-joined and finishing after the C# Main() returned.
  3. (Probably) unmanaged / native C++ BSD socket code built on the Microsoft Winsock API.

Interestingly enough, in my case the crash always manifested on .NET Framework (versions 3.5 through 4.8), and never on .NET Core or .NET 5 and newer.

My crash stack trace looked exactly like the one referenced in the original question:

iertutil.dll!_purecall()
iertutil.dll!SettingStore::CKeyCache::OpenKey()
iertutil.dll!GetValue_Internal()
wininet.dll!ChangeGlobalSettings()
wininet.dll!FixProxySettings()
wininet.dll!InternetGetConnectedStateExW()
wininet.dll!InternetGetConnectedState()

The other interesting thing is that the identical C++ code (implemented in a DLL that the .NET app calls into via SWIG and P/Invoke) did not crash when run from a C++ .exe app.

So, this was happening only with C# app builds targeting .NET Framework versions up to 4.8.

The following C# code has completely fixed my problem:

using System;
using System.Runtime.InteropServices;

internal static class App
{
    [DllImport("wininet.dll", SetLastError = true)]
    private static extern bool InternetGetConnectedState(out int lpdwFlags, int dwReserved);

    private static int Main(string[] args)
    {
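        // Touch WinINet once, before anything else runs; the result and flags are ignored.
        InternetGetConnectedState(out int _, 0);

        // ... your original Main() code goes here ...
        return 0; // placeholder; keep your original return value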
    }
}

While I cannot offer a fully rational technical explanation of why the one-line call to InternetGetConnectedState() (via P/Invoke) works, I can speculate the following:

  1. The crash is caused by a weird interaction between Winsock and the .NET Framework CLR.
  2. Winsock apparently uses the WinINet API within its internal C++ implementation, through functions like InternetGetConnectedState(),
  3. probably added by Microsoft to speed up the detection of TCP socket connectivity issues.
  4. The WinINet API apparently maintains some global / static state.
  5. That state probably becomes uninitialized or corrupted when Winsock-driven BSD socket operations are performed in the underlying native C++ code (which is exposed to C# through SWIG's light C# wrapper classes and P/Invoke),
  6. without the .NET managed layer being made aware of that state.

So, the proposed fix is to make at least one call to InternetGetConnectedState() from within the C# Main() (through P/Invoke), as early as possible. This probably makes the WinINet API initialize its internal global state much earlier, from within the C# / .NET managed layer, and therefore gives that global state a longer lifetime / scope.

It may be worth taking my solution for a spin, to see if it fixes your problem.

Vasile
  • I might be reading it wrong, but I believe this isn't an answer, but a question disguised as an answer. If not, could you try to make it clearer what the answer to the poster's question is? – Diceble Jun 21 '23 at 18:25
  • Sorry, I meant my original posting as a comment, not as an answer. Can another member switch it to being a comment? I'm too new to SO as a contributor, and I can't seem to add comments to the original question/problem. Thanks. – Vasile Jun 22 '23 at 19:40
  • This is too long to be a comment, so it would have to be summarized. Also, once your reputation is over 50 you can comment on questions. I believe you can't convert answers to comments, so your suggestion is not an option. I can only add the comment for you if you give me a summary of at most 600 characters. – Diceble Jun 26 '23 at 14:51
  • I've managed to come up with a technical solution to the problem. At least it's fixing the other use case of mine, which was very similar to what the original poster has been experiencing. So, I've edited my original posting, to be framed now as a true answer instead of my (rather unintentional) comment-or-follow-up-question-disguised-as-answer. I hope it looks ok now. And thank you for your patience. – Vasile Jul 11 '23 at 17:15