I have written a short program that reads a Windows .obj file, finds the .text section, and runs the code in it. To do this I make the following Windows API calls (full code [gist.github.com], for those interested):

```cpp
HANDLE FileHandle = CreateFile("lib.obj",
                               GENERIC_READ | GENERIC_EXECUTE,
                               FILE_SHARE_READ, 0,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);

HANDLE MappingHandle = CreateFileMapping(FileHandle, 0, PAGE_EXECUTE_READ, 0, 0, 0);

void *Address = MapViewOfFile(MappingHandle, FILE_MAP_EXECUTE | FILE_MAP_READ,
                              0, 0, 0);
```

I then find the .text section in the file and cast the pointer to the code to a function pointer in C++ and simply call the function. This actually appeared to work for me.
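For readers who want to see what "finding the .text section" involves, here is a minimal sketch. It is not the code from the gist: the struct and function names (`CoffFileHeader`, `CoffSectionHeader`, `FindTextSection`) are hypothetical, the field layouts are taken from the PE/COFF specification (the Windows SDK equivalents are `IMAGE_FILE_HEADER` and `IMAGE_SECTION_HEADER`), and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical local copies of the COFF header layouts (PE/COFF specification).
#pragma pack(push, 1)
struct CoffFileHeader {
    std::uint16_t Machine;
    std::uint16_t NumberOfSections;
    std::uint32_t TimeDateStamp;
    std::uint32_t PointerToSymbolTable;
    std::uint32_t NumberOfSymbols;
    std::uint16_t SizeOfOptionalHeader;  // normally 0 in an object file
    std::uint16_t Characteristics;
};
struct CoffSectionHeader {
    char          Name[8];               // not necessarily NUL-terminated
    std::uint32_t VirtualSize;
    std::uint32_t VirtualAddress;
    std::uint32_t SizeOfRawData;
    std::uint32_t PointerToRawData;      // file offset of the section bytes
    std::uint32_t PointerToRelocations;
    std::uint32_t PointerToLinenumbers;
    std::uint16_t NumberOfRelocations;
    std::uint16_t NumberOfLinenumbers;
    std::uint32_t Characteristics;
};
#pragma pack(pop)
static_assert(sizeof(CoffFileHeader) == 20, "unexpected file header size");
static_assert(sizeof(CoffSectionHeader) == 40, "unexpected section header size");

struct TextSection {
    std::size_t offset;  // offset of the code from the start of the file
    std::size_t size;    // number of raw bytes in the section
};

// Scans the section table for the first section whose name starts with
// ".text" (this also matches names such as ".text$mn").
// Returns false if no such section exists or the data is truncated.
bool FindTextSection(const std::uint8_t* data, std::size_t length, TextSection* out) {
    assert(data != nullptr && out != nullptr);

    CoffFileHeader file_header;
    if (length < sizeof file_header) return false;
    std::memcpy(&file_header, data, sizeof file_header);

    std::size_t cursor = sizeof file_header + file_header.SizeOfOptionalHeader;
    for (std::uint16_t i = 0; i < file_header.NumberOfSections; ++i) {
        if (cursor + sizeof(CoffSectionHeader) > length) return false;

        CoffSectionHeader section;
        std::memcpy(&section, data + cursor, sizeof section);
        cursor += sizeof section;

        if (std::strncmp(section.Name, ".text", 5) != 0) continue;

        // Reject sections whose raw data would run past the end of the file.
        std::uint64_t raw_end = std::uint64_t{section.PointerToRawData} + section.SizeOfRawData;
        if (raw_end > length) return false;

        out->offset = section.PointerToRawData;
        out->size = section.SizeOfRawData;
        return true;
    }
    return false;
}
```

With the mapped view from the question, the call would look roughly like `FindTextSection(static_cast<const std::uint8_t*>(Address), FileSize, &Text)`, where `FileSize` is a hypothetical variable holding the file length (for instance from `GetFileSizeEx`), and the code would then start at `Address` plus `Text.offset`. The sketch ignores relocations, so it only suits code that needs no fix-ups.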

Have I made a mistake by not calling FlushInstructionCache on the range of virtual memory mapped to the file?

I ask this because I was recently reading the VirtualAlloc documentation and it notes at the bottom:

> When creating a region that will be executable, the calling program bears responsibility for ensuring cache coherency via an appropriate call to FlushInstructionCache once the code has been set in place. Otherwise attempts to execute code out of the newly executable region may produce unpredictable results.

Is it possible that my code will cause the CPU to execute old instructions in the instruction cache?

There is no such note on the MapViewOfFile or CreateFileMapping pages.

Fsmv
  • I believe you have to call FlushInstructionCache after writing to the memory area and before executing it. Since you're not writing to it, this is irrelevant for you. – user253751 Feb 27 '17 at 06:40
  • I found this [comp.lang.asm.x86](http://comp.lang.asm.x86.narkive.com/btXRQXMH/why-do-we-need-flushinstructioncache) page. It seems to suggest that these instruction cache issues may be platform dependent. – Fsmv Feb 27 '17 at 07:06

1 Answer

If you only load the file content into memory using MapViewOfFile, it should be fine without calling FlushInstructionCache.

If you MODIFY the content in memory, you need to flush the instruction cache before executing the code, as it MAY exist in the cache in its unmodified form, and MAY then be executed without your modifications.

I use the word MAY for two reasons:

  1. It depends on the processor architecture whether the processor detects writes to memory it is about to execute. Some processors don't even have hardware to notice writes to data that is held in the instruction cache, because such writes are so rare.

  2. It's hard to predict what may be in a cache: processors have all manner of "clever" ways to prefetch and, in general, "fill" caches.

Obviously, memory freshly returned by VirtualAlloc has zero chance of already containing the code you want, so the note appears there because you will ALWAYS write to it before executing.

Modifications include, for example, fix-ups for absolute addresses (something you'd have to do if you want to complete a project that loads something complex and executes it), or, if you write a debugger, setting a breakpoint by replacing an instruction with the INT 3 instruction on x86.

A second case of "modification" is if you unload the file and load a different file (perhaps the "same" file, but rebuilt), in which case the previously executed code may still be in the cache, and you get the mysterious "why didn't my changes do what I expect?"
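The debugger-breakpoint case mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FlushCodeRange`, `SetBreakpoint`, and `ClearBreakpoint` are hypothetical helper names, the buffer is assumed to be writable (the read-only view in the question would not be), and on non-Windows GCC/Clang builds the sketch falls back to `__builtin___clear_cache` so it can be compiled elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper: tells the system that [begin, begin + size) may
// contain modified code that must not be served from a stale cache.
inline void FlushCodeRange(void* begin, std::size_t size) {
#ifdef _WIN32
    FlushInstructionCache(GetCurrentProcess(), begin, size);
#elif defined(__GNUC__)
    char* first = static_cast<char*>(begin);
    __builtin___clear_cache(first, first + size);
#else
    (void)begin;  // no known flush primitive on this toolchain
    (void)size;
#endif
}

constexpr std::uint8_t kInt3 = 0xCC;  // one-byte x86 breakpoint opcode (INT 3)

// Replaces the byte at code[offset] with INT 3, flushes that byte's range,
// and returns the original byte so it can be restored later.
std::uint8_t SetBreakpoint(std::uint8_t* code, std::size_t offset) {
    assert(code != nullptr);
    std::uint8_t original = code[offset];
    code[offset] = kInt3;
    FlushCodeRange(code + offset, 1);
    return original;
}

// Restores the saved byte and flushes again, since this is another
// modification of code that may already be cached.
void ClearBreakpoint(std::uint8_t* code, std::size_t offset, std::uint8_t original) {
    assert(code != nullptr);
    code[offset] = original;
    FlushCodeRange(code + offset, 1);
}
```

The point of the design is that every write to code is paired with a flush before the code can run again, regardless of whether a particular processor would have coped without it.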

Mats Petersson
  • That's an interesting point about debuggers, I had/have no idea how they work. – Fsmv Feb 27 '17 at 07:34
  • It was actually when I implemented a debugger for a small (in-house) OS that I encountered this for the first time - some 20 or so years ago! It wouldn't stop on my breakpoint - until I flushed the I-cache! :) – Mats Petersson Feb 27 '17 at 07:43
  • I don't understand why writing to memory as a result of `MapViewOfFile` is different from explicitly writing to memory that was allocated using `VirtualAlloc`. Why would calling `FlushInstructionCache()` be required in the latter, yet not in the former? And how does freshly `VirtualAlloc`'ed memory wind up in a CPU's instruction cache anyway? – IInspectable Feb 27 '17 at 14:09
  • @IInspectable: the memory manager must be responsible for flushing the cache whenever a page from an executable module is read into memory for the first time. Theoretically the memory manager could behave differently for a file mapping created with `MapViewOfFile` than for a file mapping created by the Windows loader, but that seems unlikely in practice. As for `VirtualAlloc`, if your program writes to the newly allocated memory and then jumps directly to it, the CPU might have looked ahead and started caching the code at the other end of the jump before it has been written. – Harry Johnston Feb 27 '17 at 22:47
  • @HarryJohnston: *From your argumentation* it only follows that using `FlushInstructionCache` is unnecessary for file views from a section created with the `SEC_IMAGE` flag. I'm not saying that it is necessary otherwise, but the *argument* that the same thing that happens when the Windows loader loads executables probably happens when the user calls `MapViewOfFile`, and thus using such views is safe and doesn't require `FlushInstructionCache`, applies only to sections created with the `SEC_IMAGE` flag. There would be no reason for Windows to flush the instruction cache for other sections. – conio Mar 02 '17 at 02:38
  • @conio, fair enough, my statement was stronger than it should have been. My intent was to explain why `MapViewOfFile` *might* reasonably behave differently to `VirtualAlloc` but the way I put it makes it sound as if it *must* behave differently. FWIW, I think an argument could be made that it *should* behave that way, for compatibility with Windows 95/98/ME (which I believe didn't support `SEC_IMAGE`) but that's still not conclusive. – Harry Johnston Mar 02 '17 at 03:05
  • Even for x86 with coherent I-cache, Raymond Chen explains: [If FlushInstructionCache doesn’t do anything, why do you have to call it, revisited](https://devblogs.microsoft.com/oldnewthing/20190902-00/?p=102828). You need the compiler to avoid reordering a store after a call through a function pointer. Unlikely in practice for a memory-mapped file buffer, though (not a private local array). Same reason applies to GCC `__builtin___clear_cache()` - [this](https://stackoverflow.com/posts/comments/85964322) shows a real example of code that manages to break without it. – Peter Cordes May 22 '21 at 18:49