I recently asked about repairing a corrupt event log, because at the time it seemed to be a one-off event. The event log has since become corrupt three more times. We have been looking for a pattern, but so far we have found nothing. The server runs several ASP.NET applications and three scheduled tasks written in .NET. On one occasion the event log's last-modified time coincided with a run of one of the scheduled tasks, but on the other occasions it did not.

Any suggestions on where to look next, or on a way to get any information out of a corrupt evtx file?

The server is running critical e-commerce applications, so we want to keep the number of restarts required to a minimum.

Edit: I ran DUMPEL and got very strange results.

1/9/2012    4:14:05 PM  1   100 1000    Application Error       N/A SERVERNAME  Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7a5f8  Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58  Exception code: 0xc0000374  Fault offset: 0x000ce653  Faulting process id: 0x1070  Faulting application start time: 0x01cccf1386d30991  Faulting application path: C:\Windows\SysWOW64\inetsrv\w3wp.exe  Faulting module path: C:\Windows\SysWOW64\ntdll.dll  Report Id: dbf4f691-3b06-11e1-9025-005056a602e6  
1/9/2012    4:14:07 PM  4   0   1001    Windows Error Reporting N/A SERVERNAME  Fault bucket , type 0  Event Name: APPCRASH  Response: Not available  Cab Id: 0    Problem signature:  P1: w3wp.exe  P2: 7.5.7601.17514  P3: 4ce7a5f8  P4: StackHash_79d9  P5: 6.1.7601.17514  P6: 4ce7ba58  P7: c0000374  P8: 000ce653  P9:   P10:     Attached files:  C:\Windows\Temp\WER975.tmp.appcompat.txt  C:\Windows\Temp\WERA03.tmp.WERInternalMetadata.xml  C:\Windows\Temp\WERA13.tmp.hdmp  C:\Windows\Temp\WERD21.tmp.mdmp    These files may be available here:  C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_cd7d09dfc84119d82a2ac6a789038bd5661acfb_cab_128f0e67    Analysis symbol:   Rechecking for solution: 0  Report Id: dbf4f691-3b06-11e1-9025-005056a602e6  Report Status: 4  
1/9/2012    4:14:07 PM  4   0   1001    Windows Error Reporting N/A SERVERNAME  Fault bucket , type 0  Event Name: APPCRASH  Response: Not available  Cab Id: 0    Problem signature:  P1: w3wp.exe  P2: 7.5.7601.17514  P3: 4ce7a5f8  P4: StackHash_79d9  P5: 6.1.7601.17514  P6: 4ce7ba58  P7: c0000374  P8: 000ce653  P9:   P10:     Attached files:  C:\Windows\Temp\WER975.tmp.appcompat.txt  C:\Windows\Temp\WERA03.tmp.WERInternalMetadata.xml  C:\Windows\Temp\WERA13.tmp.hdmp  C:\Windows\Temp\WERD21.tmp.mdmp    These files may be available here:  C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_cd7d09dfc84119d82a2ac6a789038bd5661acfb_cab_128f0e67    Analysis symbol:   Rechecking for solution: 0  Report Id: dbf4f691-3b06-11e1-9025-005056a602e6  Report Status: 0  
1/9/2012    4:14:12 PM  1   100 1000    Application Error       N/A SERVERNAME  Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7a5f8  Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58  Exception code: 0xc0000374  Fault offset: 0x000ce653  Faulting process id: 0x16ac  Faulting application start time: 0x01cccf139f475c0c  Faulting application path: C:\Windows\SysWOW64\inetsrv\w3wp.exe  Faulting module path: C:\Windows\SysWOW64\ntdll.dll  Report Id: e03bae70-3b06-11e1-9025-005056a602e6  
1/9/2012    4:14:16 PM  4   0   1001    Windows Error Reporting N/A SERVERNAME  Fault bucket , type 0  Event Name: APPCRASH  Response: Not available  Cab Id: 0    Problem signature:  P1: w3wp.exe  P2: 7.5.7601.17514  P3: 4ce7a5f8  P4: StackHash_9c6c  P5: 6.1.7601.17514  P6: 4ce7ba58  P7: c0000374  P8: 000ce653  P9:   P10:     Attached files:  C:\Windows\Temp\WER2579.tmp.appcompat.txt  C:\Windows\Temp\WER25F7.tmp.WERInternalMetadata.xml  C:\Windows\Temp\WER25F8.tmp.hdmp  C:\Windows\Temp\WER28F6.tmp.mdmp    These files may be available here:  C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_c49a67649524ad11b64bbf809211bc5ba742a3d6_cab_0b63321b    Analysis symbol:   Rechecking for solution: 0  Report Id: e03bae70-3b06-11e1-9025-005056a602e6  Report Status: 4  
1/9/2012    4:14:16 PM  4   0   1001    Windows Error Reporting N/A SERVERNAME  Fault bucket , type 0  Event Name: APPCRASH  Response: Not available  Cab Id: 0    Problem signature:  P1: w3wp.exe  P2: 7.5.7601.17514  P3: 4ce7a5f8  P4: StackHash_9c6c  P5: 6.1.7601.17514  P6: 4ce7ba58  P7: c0000374  P8: 000ce653  P9:   P10:     Attached files:  C:\Windows\Temp\WER2579.tmp.appcompat.txt  C:\Windows\Temp\WER25F7.tmp.WERInternalMetadata.xml  C:\Windows\Temp\WER25F8.tmp.hdmp  C:\Windows\Temp\WER28F6.tmp.mdmp    These files may be available here:  C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_c49a67649524ad11b64bbf809211bc5ba742a3d6_cab_0b63321b    Analysis symbol:   Rechecking for solution: 0  Report Id: e03bae70-3b06-11e1-9025-005056a602e6  Report Status: 0  
1/9/2012    4:14:21 PM  1   100 1000    Application Error       N/A SERVERNAME  Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7a5f8  Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58  Exception code: 0xc0000374  Fault offset: 0x000ce653  Faulting process id: 0x17f8  Faulting application start time: 0x01cccf13a4ba5126  Faulting application path: C:\Windows\SysWOW64\inetsrv\w3wp.exe  Faulting module path: C:\Windows\SysWOW64\ntdll.dll  Report Id: e57a0a85-3b06-11e1-9025-005056a602e6  
1/9/2012    4:14:21 PM  4   0   1001    Windows Error Reporting N/A SERVERNAME  Fault bucket , type 0  Event Name: APPCRASH  Response: Not available  Cab Id: 0    Problem signature:  P1: w3wp.exe  P2: 7.5.7601.17514  P3: 4ce7a5f8  P4: StackHash_9c6c  P5: 6.1.7601.17514  P6: 4ce7ba58  P7: c0000374  P8: 000ce653  P9:   P10:     Attached files:    These files may be available here:  C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_c49a67649524ad11b64bbf809211bc5ba742a3d6_1cfb4872    Analysis symbol:   Rechecking for solution: 0  Report Id: e57a0a85-3b06-11e1-9025-005056a602e6  Report Status: 4  
1/9/2012    4:14:21 PM  4   0   1001    Windows Error Reporting N/A SERVERNAME  Fault bucket , type 0  Event Name: APPCRASH  Response: Not available  Cab Id: 0    Problem signature:  P1: w3wp.exe  P2: 7.5.7601.17514  P3: 4ce7a5f8  P4: StackHash_9c6c  P5: 6.1.7601.17514  P6: 4ce7ba58  P7: c0000374  P8: 000ce653  P9:   P10:     Attached files:    These files may be available here:  C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_w3wp.exe_c49a67649524ad11b64bbf809211bc5ba742a3d6_1cfb4872    Analysis symbol:   Rechecking for solution: 0  Report Id: e57a0a85-3b06-11e1-9025-005056a602e6  Report Status: 0  

None of the referenced files actually exist (not even in the WER ReportArchive). These should not be the only events listed. In fact, the log file has been cleared twice since January 9, so these events should not be listed at all.
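To quantify how stale the dumped entries are, the DUMPEL lines can be parsed and compared against the time the log was last cleared. The following is a minimal Python sketch, not a definitive tool. It assumes the column order seen in the output above (date, time, type, category, event ID), a month/day/year date, and whitespace-separated fields; `parse_line` and `stale_events` are hypothetical helper names, and the clear time is something you would have to supply yourself.

```python
import re
from datetime import datetime

# Leading columns as they appear in the DUMPEL output above (assumed layout):
# date, time, event type, category, event ID.
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4})\s+"
    r"(?P<time>\d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"(?P<type>\d+)\s+(?P<category>\d+)\s+(?P<event_id>\d+)\s+"
)
REPORT_ID_RE = re.compile(r"Report Id: ([0-9a-f-]{36})")


def parse_line(line):
    """Return a dict for one DUMPEL line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Assumes month/day/year, which matches "January 9" in the question.
    when = datetime.strptime(
        m["date"] + " " + m["time"], "%m/%d/%Y %I:%M:%S %p"
    )
    report = REPORT_ID_RE.search(line)
    return {
        "when": when,
        "event_id": int(m["event_id"]),
        "report_id": report.group(1) if report else None,
    }


def stale_events(lines, cleared_at):
    """Return parsed events that predate the time the log was cleared."""
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed if e is not None and e["when"] < cleared_at]
```

Feeding it the lines of a saved DUMPEL dump together with the time of the most recent clear would list every entry that should no longer be present, grouped however you like by `report_id`.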

Update (2016-06-14):
We don't have this server anymore and therefore can no longer test proposed solutions. We never found out what was wrong, but we moved all our services onto new servers since then.

yakatz
  • My first step would be to try to replicate this in a non-production environment. Can you set up another server with the same apps and see if it recurs, or set up a VM copy? – Sam Cogan Jan 17 '12 at 22:35
  • @Sam I am trying to scrounge up the necessary resources for that. – yakatz Jan 17 '12 at 22:38
  • Did you find a solution? Please answer your own question. Thanks – Leandro Bardelli Oct 10 '12 at 17:16
  • @Leandro we did not find a solution, but it seems to have stopped happening recently on its own. – yakatz Oct 10 '12 at 21:11
  • Has the code in the app pool changed at all since this originally occurred? The dumpel output suggests that one of your app pools was crashing and error reporting was checking with MS for the status on that particular crash. I'd guess there was an uncaught exception in the code that was crashing the app pool and that it has been fixed. – Nathan V Nov 27 '12 at 09:31
  • @NathanV This has nothing to do with the events shown in DUMPEL. The problems in the application were fixed long before this question was posted on January 17. My problem (still) is that the Event Viewer does not update properly and DUMPEL on January 18 gave results that did not make any sense. – yakatz Nov 27 '12 at 15:02
  • @yakatz Ah, I see. Misread the intent. :) As for the corrupted log, then; that's still ongoing? – Nathan V Nov 27 '12 at 15:09
  • Still ongoing, although not getting corrupt quite as often as it was. No major changes that might explain why. – yakatz Nov 27 '12 at 17:53
  • Do you have an identical setup running w/same load (either in development or a load-balanced setup) yet? Have you checked IDS, Security, and System logs - and possibly IIS logs? Is it just App logs getting corrupted? And how close do CPU and RAM usage get to available limits? – Lizz Jan 05 '13 at 05:42
  • You get events in the event log when they are added to the event log, not when they happen. Crash events, and event log crash events, are NEVER added when they happen; it's not possible. They are ALWAYS added when you query the crash dump log, whatever that is. They ALWAYS show up days or months later when you reboot or do something else that triggers the add-to-event-log activity. – user165568 Mar 29 '13 at 14:01
  • Generally this corruption is the result of something writing events. I would use "event log explorer" to open up the raw evt as it's worked for me with corrupted event logs in the past. Check to see which non-windows providers are writing events. – ThatOneDude Jul 20 '15 at 04:38
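As a rough complement to opening the raw file in a dedicated viewer, a copy of the corrupt .evtx file can be scanned for intact chunk headers. This is a minimal Python sketch under stated assumptions, not a recovery tool: it relies on the publicly documented EVTX layout as I understand it (a 4096-byte file header starting with `ElfFile\0`, followed by 64 KiB chunks that each start with `ElfChnk\0` and carry first and last record numbers at offsets 8 and 16). `scan_evtx` is a hypothetical name, and the field offsets should be checked against a format reference before being trusted.

```python
import struct

# Signatures and sizes from the publicly documented EVTX layout (assumed).
FILE_MAGIC = b"ElfFile\x00"
CHUNK_MAGIC = b"ElfChnk\x00"
FILE_HEADER_SIZE = 0x1000   # 4096-byte file header
CHUNK_SIZE = 0x10000        # 64 KiB per chunk


def scan_evtx(data):
    """Return (header_ok, chunks) for the raw bytes of an .evtx file.

    Each chunk entry is (offset, signature_ok, first_record, last_record);
    the record numbers are None when the chunk signature is missing.
    """
    header_ok = data[:8] == FILE_MAGIC
    chunks = []
    for offset in range(FILE_HEADER_SIZE, len(data), CHUNK_SIZE):
        head = data[offset:offset + 24]
        if len(head) == 24 and head[:8] == CHUNK_MAGIC:
            # Two little-endian 64-bit record numbers follow the signature.
            first, last = struct.unpack_from("<QQ", head, 8)
            chunks.append((offset, True, first, last))
        else:
            chunks.append((offset, False, None, None))
    return header_ok, chunks
```

Reading the file with `open(path, "rb").read()` and passing the bytes to `scan_evtx` shows at which offsets the chunk signatures stop matching, which may help narrow down when the damage was written. Work on a copy so the live log is never touched.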

2 Answers

Surprised this hasn't been mentioned before: have you verified the filesystem? If it's a local disk and you can suck up the downtime, flag the volume for a chkdsk and reboot. Do a surface scan if at all possible.

Note that this will be very time-consuming, especially on a large (50 GB+) volume. Shoot for a weekend if at all possible.

Signal15
  • It is actually a VM, so we don't have access to the physical disk. – yakatz Oct 03 '13 at 22:01
  • The fact it's a VM is irrelevant. You may have filesystem corruption. Run a 'chkdsk' during your next downtime/maintenance interval. – Signal15 Oct 04 '13 at 20:39

Sounds like you may have a problem with filesystem corruption. A good way to look into this without having to reboot is to run:

sfc /scannow

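and see whether it reports a multitude of corrections or errors. Keep in mind that sfc verifies protected system files rather than the filesystem itself, so treat its output as a hint only. If it does report problems, the best next step is to reboot and run a chkdsk to repair your partitions and correct any errors in them. After that, if you're still having issues, you may need to talk to your provider about the underlying hardware.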

rtw
  • This virtual server doesn't exist anymore so I can't test anything new with this question, but I know the filesystem was OK and we had run `sfc` before and had not gotten any errors. – yakatz Jun 14 '16 at 19:54