
Occasionally, when deploying my app*, a seemingly random one of the bundled DLLs ends up in a strange state. On the instance I'm debugging now it's System.Net.Security.dll (from .NET Core 3.1), but it could be any of them.

  • In Explorer, the DLL shows no version information (version, product name, copyright, etc).
  • The .NET app fails to load the library, yielding a BadImageFormatException: assembly manifest not found.
  • Resource Hacker does show the version info block.
  • When I make a copy of the DLL, the copy works fine. The original and copy are bit-for-bit identical.
  • When I make a disk image of the disk and attach it elsewhere, the problematic file is fine.
  • When I call GetFileVersionInfoSizeEx() on the DLL, it fails with error 1813 (ERROR_RESOURCE_TYPE_NOT_FOUND). The same call succeeds on the bit-for-bit identical copy.
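
The comparison in the last bullets could be reproduced with a short script. This is only a sketch, not the code used for the question: `file_digest` and `version_info_size` are hypothetical helper names, the use of Python with ctypes is an assumption, and `FILE_VER_GET_NEUTRAL` is an arbitrary choice of flag. The Win32 call is `GetFileVersionInfoSizeExW` from version.dll, so the second helper only runs on Windows.

```python
import ctypes
import hashlib
import sys

FILE_VER_GET_NEUTRAL = 0x02  # flag value from the Windows SDK headers


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def version_info_size(path, flags=FILE_VER_GET_NEUTRAL):
    """Call GetFileVersionInfoSizeExW; return (size, last_error).

    last_error is 0 when the call reports a non-zero size.
    """
    if sys.platform != "win32":
        raise OSError("GetFileVersionInfoSizeExW is only available on Windows")
    from ctypes import wintypes

    version = ctypes.WinDLL("version", use_last_error=True)
    fn = version.GetFileVersionInfoSizeExW
    fn.argtypes = [wintypes.DWORD, wintypes.LPCWSTR,
                   ctypes.POINTER(wintypes.DWORD)]
    fn.restype = wintypes.DWORD

    handle = wintypes.DWORD(0)
    size = fn(flags, path, ctypes.byref(handle))
    err = ctypes.get_last_error() if size == 0 else 0
    return size, err
```

Running `file_digest` on the original and the copy should give equal digests if the files really are identical, while `version_info_size` would be expected to return an error of 1813 for the original only.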


Since there is clearly nothing wrong with the bits in the file, I'm inclined to suspect corruption in some sort of cache or in-memory data structure.

How come the version info cannot be read for this DLL, while it can be read just fine from an identical copy?


I did some digging and found that the resource APIs call LoadLibraryEx() with LOAD_LIBRARY_AS_DATAFILE | LOAD_LIBRARY_AS_IMAGE_RESOURCE. After jumping through some hoops we end up in ntdll's private LdrpResGetResourceDirectory(), which does some calculations and looks up a field at an offset inside the module:

00007ffd`c4c17f61 0fb74a0e        movzx   ecx,word ptr [rdx+0Eh]
00007ffd`c4c17f80 6685c9          test    cx,cx
00007ffd`c4c17f83 7463            je      ntdll!LdrpResGetResourceDirectory+0x408 (00007ffd`c4c17fe8)

rdx is a pointer to somewhere inside the module. From my reading:

rdx = handle + 0x100 + somestruct.(0x14) - somestruct.(0x0C)

This is where the execution diverges between the problematic original DLL and the bit-for-bit identical copy: cx is 0 for the bad original but 1 for the good copy.

(I've tried calling LoadLibrary() on the DLL and then dumping the memory to compare it against the original DLL, but I can't see the forest for the trees in the resulting diff.)
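
One way to narrow such a diff down is to read the same field straight from the file on disk. The sketch below assumes, as suggested in the comments, that `rdx` points at the root `IMAGE_RESOURCE_DIRECTORY`, so that offset 0x0E is `NumberOfIdEntries`. The function name `read_root_resource_counts` is hypothetical, and the parsing follows the published PE layout rather than anything ntdll is known to do internally.

```python
import struct

RESOURCE_DIR_INDEX = 2  # IMAGE_DIRECTORY_ENTRY_RESOURCE


def read_root_resource_counts(data):
    """Return (NumberOfNamedEntries, NumberOfIdEntries) of the root
    resource directory of a PE image given as raw file bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ image")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")

    coff = e_lfanew + 4                      # COFF file header (20 bytes)
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)

    opt = coff + 20                          # optional header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:                       # PE32
        dirs = opt + 96
    elif magic == 0x20B:                     # PE32+
        dirs = opt + 112
    else:
        raise ValueError("unknown optional header magic")

    (num_dirs,) = struct.unpack_from("<I", data, dirs - 4)
    if num_dirs <= RESOURCE_DIR_INDEX:
        raise ValueError("no resource data directory")
    rva, _size = struct.unpack_from("<II", data, dirs + 8 * RESOURCE_DIR_INDEX)
    if rva == 0:
        raise ValueError("image has no resources")

    # Map the RVA to a file offset through the section table.
    sections = opt + opt_size
    for i in range(num_sections):
        sh = sections + 40 * i
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", data, sh + 8)
        if vaddr <= rva < vaddr + max(vsize, raw_size):
            off = raw_ptr + (rva - vaddr)
            # NumberOfNamedEntries at +0x0C, NumberOfIdEntries at +0x0E
            return struct.unpack_from("<HH", data, off + 0x0C)
    raise ValueError("resource RVA not inside any section")
```

Since the two files are bit-for-bit identical, this should return the same pair for both. If the second value matches the 1 seen for the good copy, that would suggest the 0 seen for the bad original comes from the mapped view rather than from the file contents, though that remains an inference.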


*) A self-contained .NET Core 3.1 application that's deployed to Windows Server 2019 Azure Batch nodes by way of downloading and extracting a zip, then robocopying it to shared/.

Sijmen Mulder
  • A quick debug tells me that `rdx` is pointing to the [`IMAGE_RESOURCE_DIRECTORY`](https://doxygen.reactos.org/dd/d43/pedump_8c_source.html#l00382) of the PE file, thus +0x0e is pointing at the `NumberOfIdEntries` field (confirmed by setting a BP on the function and looking at the pointer). I'm still puzzled as to why in your case, for two identical copies, it's different. – Neitsa Jan 07 '21 at 11:07
  • Thanks @Neitsa. Windbg (classic) didn't show me this info, maybe I need to set up symbol servers or such. – Sijmen Mulder Jan 07 '21 at 11:15
  • Look at the flags in the call: *Loads the version resource strings from the **corresponding MUI file**, if available*. So it actually tries to load from another file (the MUI file, located by name). – RbMm Jan 07 '21 at 11:35
  • Already checked that, @RbMm, it occurs with all combinations of the `FILE_VER_*` flags (and succeeds on the copy, with all combinations). Also occurs when the filenames are swapped. – Sijmen Mulder Jan 07 '21 at 11:38
  • So it really looked in the MUI file, not in the DLL. That explains why the copy (with **another name**) is OK. – RbMm Jan 07 '21 at 11:39
  • I see that point, but when I rename the original to System.Net.Security-orig.dll and the copy to System.Net.Security.dll, it still fails on -orig and works with the (now named after the original) copy. – Sijmen Mulder Jan 07 '21 at 11:41
  • Interesting, but this needs to be looked at under a debugger on the system where it happens. – RbMm Jan 07 '21 at 11:42
  • I'm still looking for a way to reproduce this outside of the Azure Batch environment, so far I can only repro it by spinning up a bunch of nodes and hoping one fails. – Sijmen Mulder Jan 07 '21 at 11:44
  • It will be easier if you look at the call tree and compare the failing and working trees, like this: https://i.imgur.com/FncBpHF.png – RbMm Jan 07 '21 at 11:50
  • Can do. What tool is this? I'm using Windbg (classic, no Store on this one) but I've never used it before. – Sijmen Mulder Jan 07 '21 at 11:55
  • Debugging this requires remote access to the machine where the error occurs :) – RbMm Jan 07 '21 at 11:59
  • Ah yes that's another issue, it's behind an Azure Batch access gateway. Hence trying to reproduce it elsewhere. – Sijmen Mulder Jan 07 '21 at 12:02
