
My idea is to "lift" a 64-bit Windows executable to LLVM bitcode (or anything else higher-level than assembly) and then compile it back to a 32-bit executable.

I found that RetDec and McSema can lift a PE binary to LLVM IR (and optionally C), but McSema requires IDA Pro, so I haven't tried it yet.

I have installed MSVC v143 and Windows SDK version 10.0.19041.0:

[image: vs]

Clang version:

clang version 13.0.1 (https://github.com/llvm/llvm-project 75e33f71c2dae584b13a7d1186ae0a038ba98838)
Target: x86_64-pc-windows-msvc
Thread model: posix

I compiled this Hello World program in C using Clang:

#include <stdio.h>

int main()
{
        printf("Hello, world!\n");
}

Build command: clang hello.c -o hello.exe

Check hello.exe file type with WSL:

$ file hello.exe
hello.exe: PE32+ executable (console) x86-64, for MS Windows

You can download it here.

Then I use RetDec to lift it to LLVM IR:

python retdec-decompiler.py --no-memory-limit hello.exe

Output: here

After that we get:

[image: files]

Compile bitcode back to executable:

clang hello.exe.bc -m32 -v -Wl,/SUBSYSTEM:CONSOLE -Wl,/errorlimit:0 -fuse-ld=lld -o hello.x86.exe

Output: here

I guess functions like _WriteConsoleW are Win32 APIs, but ___decompiler_undefined_function_0 might have been generated by the decompiler somehow.

Also, the decompiled code has no main function, but it does have an entry_point function. From hello.exe.ll:

[image: hello.exe.ll]

hello.exe.c also has entry_point instead of main:

[image: hello.exe.c]

Also, hello.exe.c doesn't contain ___decompiler_undefined_function_0.

I also tried running the bitcode with lli:

lli --entry-function=entry_point hello.exe.bc

Output: here

Here is the link to the files.

How can I make this compile? Thanks!

raspiduino

1 Answer


That's very ambitious.

I'm going to go out on a limb and say that every Windows application includes thousands of system header files, most of which use types whose size differs between 32- and 64-bit systems and many of which contain #ifdef or other platform-dependent differences. You'll have a large .ll file full of windows64-specific types and code.

If the developers at Microsoft saw windows64 as a good chance to drop some hacks that were needed for w95 code, then you'll have w32-incompatible code there, too.

What you have to do is what the Wine developers did: add code to cater to each problem in turn. There will be thousands of cases to handle. Some of it will be very difficult. When you see the number 128 in the .ll file, was it sizeof(this_w64_struct) in the original source, sizeof(that_other_struct) or something else entirely? Should you change the number, and if so, to what?

You should expect this project to take at least years, maybe a decade or more. Good luck.

arnt
  • OK, so I understand that the code RetDec generates is meant to be easy to read, not to be recompiled. I think the best way to do this is either something like [box86](https://github.com/ptitSeb/box86) but for x86_64 -> x86, or something like [qemu user mode emulation](https://qemu-project.gitlab.io/qemu/user/index.html) but for NT instead of POSIX – raspiduino Jun 04 '22 at 08:51
  • The best way to do it is to do it. Step 1: Start on the job. Step 2: Stay on the job. – arnt Jun 04 '22 at 20:48