We have a C# application that has been running in production for a few years. After a few years without problems, it started crashing at random times. It doesn't hang; it simply closes itself.

I started by adding a lot of logging to a text file through log4net. This way I discovered that the application crashes when invoking a method from another class.

    // Dutch log text: "putting images of input <inputname> on cameo"
    log.Info("Class OPC - groupRead_DataChanged() - " + "beelden van inputname " + inputname + " op cameo zetten");
    try
    {
        // Marshal the call onto the UI thread that owns cameoControl1.
        cameoControl1.Invoke(new Action(() => cameoControl1.AddCameras(CameraGrid)));
    }
    catch (Exception exc)
    {
        // Dutch log text: "cameoControl1.Invoke did not work"
        log.Info("Class OPC - groupRead_DataChanged() - " + "cameoControl1.Invoke werkte niet. " + exc);
        log.Error("Class OPC - groupRead_DataChanged() - " + "cameoControl1.Invoke werkte niet. " + exc.Message);
        log.Error("Class OPC - groupRead_DataChanged() - " + "cameoControl1.Invoke werkte niet. " + exc.InnerException);
        log.Error("Class OPC - groupRead_DataChanged() - " + "cameoControl1.Invoke werkte niet. " + exc.Source);
        log.Error("Class OPC - groupRead_DataChanged() - " + "cameoControl1.Invoke werkte niet. " + exc.StackTrace);
        log.Error("Class OPC - groupRead_DataChanged() - " + "cameoControl1.Invoke werkte niet. " + exc.TargetSite);
    }
    // Dutch log text: "images of input <inputname> are now on cameo 1"
    log.Info("Class OPC - groupRead_DataChanged() - " + "beelden van inputname " + inputname + " staan nu op cameo 1");
    } // closes the enclosing block (the rest of the method is not shown)

The log file shows "beelden van inputname..... op cameo zetten" ("putting images of input ... on cameo"), and then nothing. Execution doesn't even reach the logging in the catch block.

I configured the system to create dump files on crash, which gave me the following information:

Exception Code: 0xC0000005

Exception Information: The thread tried to read from or write to a virtual address for which it does not have appropriate access
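For context, an access violation like this may also explain why the catch block never logs anything. Assuming the application targets .NET Framework 4 or later (the question does not state the runtime), `AccessViolationException` is treated as a corrupted-state exception and is not delivered to ordinary `catch (Exception)` blocks. The sketch below shows one way to at least log such a fault before the process dies. `GuardedCall`, `Run`, and the `logError` callback are hypothetical names for illustration, not part of the original code.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Security;

static class GuardedCall
{
    // Assumption: .NET Framework 4.x. On .NET Core / .NET 5+ this attribute
    // is ignored and the process terminates regardless.
    [HandleProcessCorruptedStateExceptions]
    [SecurityCritical]
    public static bool Run(Action action, Action<string, Exception> logError)
    {
        try
        {
            action();
            return true;
        }
        catch (AccessViolationException ave)
        {
            // Process state may be corrupted: log, then rethrow so the
            // process still terminates and a dump is still produced.
            logError("Access violation (0xC0000005)", ave);
            throw;
        }
        catch (Exception exc)
        {
            // Ordinary managed exceptions are logged and reported as failure.
            logError("Call failed", exc);
            return false;
        }
    }
}
```

The attribute only affects faults raised on the thread that runs the guarded method, so for the code above the guard would have to execute on the UI thread, for example `cameoControl1.Invoke(new Action(() => GuardedCall.Run(() => cameoControl1.AddCameras(CameraGrid), (m, e) => log.Error(m, e))));`. If the fault happens on a different thread (for instance inside a native camera library), this would not catch it.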

After looking this up, I came to the conclusion that there are multiple possible causes: https://www.stellarinfo.com/blog/how-to-fix-error-code-0xc0000005-in-windows/

The RAM could be defective, there could be disk errors, ... We have three identical PCs, each running a variant of this application, and only one of them has these issues. So I made an image of one of the working PCs, restored it on the faulty PC, and installed the correct application variant on it. The issue remained.

So now I'm wondering: could it really be a hardware defect? How can defective RAM cause this kind of issue? Wouldn't the OS (Windows 10) take care of it?

RobbeM
  • I like how you systematically debugged the issue. How long did it take you to gather all of this information? If the issue occurs on only one PC (though I don't think it is a hardware issue), have you tried swapping hardware between the PCs, for example the RAM? It does not make much sense to me, but I would still try it. If anything else comes to mind I will let you know. – Mohammed Noureldin Jul 30 '21 at 07:13
  • It's over a period of about 2 months probably. The system is far away and it's hard to schedule to remote into it... So it's been a lot of emails back and forth... I haven't tried switching the hardware as I can't really imagine that being the issue. So that's why I came here, someone might be able to confirm that hardware actually could be the issue and why... – RobbeM Jul 30 '21 at 07:28
  • Another possibility is that some unsafe or native code performs an invalid memory access. See [related question](https://stackoverflow.com/questions/3352518/the-thread-tried-to-read-from-or-write-to-a-virtual-address-for-which-it-does-no). I would check whether you are using any native library, and pay special attention to whether it is used in a thread-safe manner. Thread-safety issues can depend on processor speed and the number of cores, so I would not rule out a software issue. On the other hand, replacing a PC might be cheaper than investigating a difficult bug. – JonasH Jul 30 '21 at 07:54
  • @JonasH, I had to look up what unsafe and native code means. It needs a keyword in the code right? Then there's no unsafe or native code. The first line of code in cameoControl1.AddCameras() is also a log. And it doesn't even output that log. I just don't understand why it's working perfectly for 3 years on the same hardware and now suddenly these problems come up? – RobbeM Jul 30 '21 at 08:20
  • Start with a memory diagnostic and a `chkdsk /r` scan. Use CrystalDiskInfo to check the condition of the hard drive. Then update all device drivers and run full Windows updates. After that, swap out every component one by one, starting with the PSU, then the graphics card (if any), RAM, hard drive, CPU, and finally the motherboard. – Charlieface Jul 30 '21 at 09:13
  • The reason I'm asking is that some types of cameras only have a native API (like DirectShow) or require third-party libraries, and these are often implemented as a native API with a .NET wrapper. – JonasH Jul 30 '21 at 09:57
  • @JonasH, yes we also had to use an API. I didn't write the software and the one who did left the company. But the AddCameras() method is in our own part of the software. Where I've put some logging as first thing to do and the logging isn't executed. So that tells me it doesn't have anything to do with the camera API but it's all in our code... – RobbeM Jul 30 '21 at 10:34

1 Answer

In my case, this silent crash (as seen in the minidump) was caused by Data Execution Prevention (DEP), and adding the application to the DEP exceptions list resolved the error.

A reminder for other visitors: be careful, and consider scanning whatever you add to the exceptions on VirusTotal or a similar site first.
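To check whether DEP applies to a process before changing any exceptions, the Win32 function `GetProcessDEPPolicy` can be queried from C#. This is a minimal sketch, not something taken from the original application: `DepPolicy`, `Describe`, and `QueryCurrentProcess` are hypothetical names, and the call is documented for 32-bit processes only, so it is expected to fail in a 64-bit process (where DEP cannot be turned off anyway).

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class DepPolicy
{
    // Flag value documented for GetProcessDEPPolicy.
    private const uint PROCESS_DEP_ENABLE = 0x1;

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetProcessDEPPolicy(
        IntPtr hProcess,
        out uint lpFlags,
        [MarshalAs(UnmanagedType.Bool)] out bool lpPermanent);

    // Pure helper: turns the raw flags into a readable description.
    public static string Describe(uint flags, bool permanent)
    {
        string state = (flags & PROCESS_DEP_ENABLE) != 0 ? "enabled" : "disabled";
        return "DEP " + state + (permanent ? " (permanent)" : "");
    }

    // Queries the DEP policy of the current process (Windows, 32-bit only).
    public static string QueryCurrentProcess()
    {
        uint flags;
        bool permanent;
        IntPtr handle = Process.GetCurrentProcess().Handle;
        if (!GetProcessDEPPolicy(handle, out flags, out permanent))
        {
            // Surfaces the Win32 error, e.g. when called from a 64-bit process.
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return Describe(flags, permanent);
    }
}
```

Logging the result of `DepPolicy.QueryCurrentProcess()` once at application startup would show whether DEP is active for the crashing process, which helps decide whether a DEP exception is even relevant.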

halt9k