
I am writing a global hook to correct window positioning on triple-head monitor setups such as the Matrox TripleHead2Go. So far it works very well for 32-bit programs, but now that I need to build the 64-bit version, I need some assistance translating the x86 opcodes of the wndproc thunk I install on each window class.

The thunk adds an extra argument to the wndproc call: the original wndproc address, so that my wndproc handler can call it at the end.

#ifdef _WIN64 
  //TODO: figure out the WIN64 instructions
#else
  const unsigned char PatchTemplate[] =
  {
    0x90,                         // nop, will become int3 if debug = true
    0x58,                         // pop eax (get the return address)
    0x68, 0x00, 0x00, 0x00, 0x00, // push imm32, original wndproc address
    0x50,                         // push eax (restore the return address)
    0x68, 0x00, 0x00, 0x00, 0x00, // push imm32, our wndproc address
    0xC3                          // retn
  };

  #define PATCH_ORIG_OFFSET 3
  #define PATCH_NEW_OFFSET  9
#endif
Jester
Geoffrey

1 Answer

Under the Windows x64 calling convention, the first 4 arguments are passed in the registers rcx, rdx, r8 and r9. Stack space (32 bytes) is nevertheless allocated for them.

We'd need to know how many arguments you are passing so that the extra argument can be put in the proper place. A standard wndproc already has 4 arguments. Your 32-bit code inserts the new argument at the beginning, so I assume that matches your C prototype and that we must do the same in 64-bit mode, even though it would be easier there to append the new argument at the end.

Furthermore, the stack must be kept 16-byte aligned, and the calling convention mandates that the caller frees the arguments (there is no stdcall in 64-bit mode). The caller doesn't know about the extra argument, so it would not restore the stack properly; the thunk must do that itself.
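The alignment requirement can be checked with simple arithmetic. The following is only a sketch: the enum and function names are made up for illustration, and it assumes the stack was 16-byte aligned immediately before the caller's `call` into the thunk, as the convention requires.

```c
// All sizes are in bytes; the names are illustrative only.
enum {
    RETURN_ADDRESS = 8,   // pushed by the caller's call into the thunk
    SHADOW_SPACE   = 32,  // home space for rcx, rdx, r8 and r9
    FIFTH_ARGUMENT = 8    // stack slot at [rsp + 32]
};

// rsp modulo 16 right before the thunk executes its own call, assuming rsp
// was 16-byte aligned right before the caller's call into the thunk.
static unsigned rsp_mod16_before_call(unsigned frame_bytes)
{
    // Unsigned arithmetic wraps around, so masking with 15 yields the
    // residue modulo 16 of (0 - 8 - frame_bytes).
    return (0u - RETURN_ADDRESS - frame_bytes) & 15u;
}
```

With a 40-byte frame (shadow space plus one extra slot) the result is 0, so the stack is aligned again at the inner call. A 32-byte frame would leave it off by 8.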

The thunk itself might look like this:

00000000 90                      nop                         ; nop, will become int3 if debug = true
00000001 4883EC28                sub rsp, 40                 ; allocate space for arguments
00000005 4C894C2420              mov [rsp + 32], r9          ; spill 4th arg to stack
0000000A 4D89C1                  mov r9, r8                  ; move 3rd arg
0000000D 4989D0                  mov r8, rdx                 ; move 2nd arg
00000010 4889CA                  mov rdx, rcx                ; move 1st arg
00000013 48B988776655443322-     mov rcx, 0x1122334455667788 ; old wndproc
0000001C 11
0000001D 48B888776655443322-     mov rax, 0x1122334455667788 ; new wndproc
00000026 11
00000027 FFD0                    call rax                    ; call new wndproc
00000029 4883C428                add rsp, 40                 ; restore stack
0000002D C3                      ret
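For comparison with the 32-bit `PatchTemplate`, the listing above could be carried over into a C byte array roughly as follows. This is a sketch: the names `PatchTemplate64Prepend`, `PATCH64_PREPEND_*`, `put_u64_le` and `build_prepend_thunk` are hypothetical, and the two offsets are simply the positions of the 8-byte immediates in the listing (0x13 + 2 and 0x1D + 2).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Bytes copied from the listing; both imm64 slots are left zeroed.
static const unsigned char PatchTemplate64Prepend[] =
{
    0x90,                         // nop, will become int3 if debug = true
    0x48, 0x83, 0xEC, 0x28,       // sub rsp, 40
    0x4C, 0x89, 0x4C, 0x24, 0x20, // mov [rsp + 32], r9
    0x4D, 0x89, 0xC1,             // mov r9, r8
    0x49, 0x89, 0xD0,             // mov r8, rdx
    0x48, 0x89, 0xCA,             // mov rdx, rcx
    0x48, 0xB9, 0, 0, 0, 0, 0, 0, 0, 0, // mov rcx, imm64 (original wndproc)
    0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, // mov rax, imm64 (our wndproc)
    0xFF, 0xD0,                   // call rax
    0x48, 0x83, 0xC4, 0x28,       // add rsp, 40
    0xC3                          // ret
};

#define PATCH64_PREPEND_ORIG_OFFSET 21
#define PATCH64_PREPEND_NEW_OFFSET  31

// Store a 64-bit value in little-endian byte order, as x86-64 expects.
static void put_u64_le(unsigned char *dst, uint64_t value)
{
    for (int i = 0; i < 8; ++i)
        dst[i] = (unsigned char)(value >> (8 * i));
}

// Copy the template into dst and patch in both addresses.
// Returns the number of bytes written.
static size_t build_prepend_thunk(unsigned char *dst,
                                  uint64_t orig_proc, uint64_t new_proc)
{
    memcpy(dst, PatchTemplate64Prepend, sizeof PatchTemplate64Prepend);
    put_u64_le(dst + PATCH64_PREPEND_ORIG_OFFSET, orig_proc);
    put_u64_le(dst + PATCH64_PREPEND_NEW_OFFSET, new_proc);
    return sizeof PatchTemplate64Prepend;
}
```

Unlike the 32-bit template, the patched immediates are 8 bytes wide, so any existing patching code that writes 4 bytes would need to be adjusted.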

Update: this should be the version that appends the old wndproc as the 5th argument:

00000000 90                      nop                         ; nop, will become int3 if debug = true
00000001 4883EC28                sub rsp, 40                 ; allocate space for arguments
00000005 48B888776655443322-     mov rax, 0x1122334455667788 ; old wndproc
0000000E 11
0000000F 4889442420              mov [rsp + 32], rax         ; add as 5th argument
00000014 48B888776655443322-     mov rax, 0x1122334455667788 ; new wndproc
0000001D 11
0000001E FFD0                    call rax                    ; call new wndproc
00000020 4883C428                add rsp, 40                 ; restore stack
00000024 C3                      ret
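As a sketch of how this version might slot into the `_WIN64` branch of the question's code, the same template and offset names can be reused with 64-bit values. The helper `build_thunk` and its `debug` parameter are hypothetical, introduced only to show the patching step; the offsets 7 and 22 are the positions of the two imm64 fields in the listing above (0x05 + 2 and 0x14 + 2).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Bytes copied from the listing above; both imm64 slots are left zeroed.
static const unsigned char PatchTemplate[] =
{
    0x90,                         // nop, will become int3 if debug = true
    0x48, 0x83, 0xEC, 0x28,       // sub rsp, 40
    0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, // mov rax, imm64 (original wndproc)
    0x48, 0x89, 0x44, 0x24, 0x20, // mov [rsp + 32], rax (5th argument)
    0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, // mov rax, imm64 (our wndproc)
    0xFF, 0xD0,                   // call rax
    0x48, 0x83, 0xC4, 0x28,       // add rsp, 40
    0xC3                          // ret
};

#define PATCH_ORIG_OFFSET 7
#define PATCH_NEW_OFFSET  22

// Hypothetical helper: copy the template into dst, patch in both
// addresses in little-endian order, and optionally turn the leading
// nop into int3 (0xCC). Returns the number of bytes written.
static size_t build_thunk(unsigned char *dst, uint64_t orig_proc,
                          uint64_t new_proc, int debug)
{
    memcpy(dst, PatchTemplate, sizeof PatchTemplate);
    for (int i = 0; i < 8; ++i) {
        dst[PATCH_ORIG_OFFSET + i] = (unsigned char)(orig_proc >> (8 * i));
        dst[PATCH_NEW_OFFSET + i]  = (unsigned char)(new_proc >> (8 * i));
    }
    if (debug)
        dst[0] = 0xCC; // int3
    return sizeof PatchTemplate;
}
```

With this layout the handler would presumably take the original wndproc as a fifth parameter after `lParam`. The buffer also has to live in executable memory before it can be installed as a window procedure, which this sketch does not cover.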
Jester
  • Excellent, thank you! You have assumed correctly: it is the wndproc prototype with the added argument, and only the one additional argument. It is no problem to move the argument to the end for the x64 platform. I am happy to accept your answer, but if you would not mind, can you please provide the code for appending the argument instead? – Geoffrey Jun 11 '14 at 15:42
  • Done. Please report if you find some problem so we can fix it. – Jester Jun 11 '14 at 15:49
  • Thanks, I will do. One question I have is why are you allocating 40 bytes on the stack? I understand the alignment requirement, but don't quite understand why 40 bytes are needed. – Geoffrey Jun 11 '14 at 15:52
  • As I said, even though the first 4 arguments are passed in registers, you still need to allocate spill space for them. We have 5 arguments each 8 bytes, so 40 bytes total. Luckily, adding the 8 bytes of the return address that will be pushed by the `call` will maintain the 16 byte alignment. – Jester Jun 11 '14 at 16:08
  • Ah, makes sense now. I am fairly fuzzy with assembler still, been teaching myself it on and off for a while now. – Geoffrey Jun 11 '14 at 16:23
  • Awesome! Worked without a single change, 64 bit processes now behave properly also :). Thanks again. – Geoffrey Jun 11 '14 at 18:35