I'm currently trying to learn MASM x64, and so far I seem to be getting the hang of things pretty well. Everything was going well right up until I tried to call CreateFileW to read the contents of a .txt file. The problematic code is as follows:

    ; Open the file for GENERIC_READ
    CALL ClearRegisters
    LEA RCX, TextTestfilePath
    MOV RDX, 80000000h  ; GENERIC_READ
    MOV R8, 00000001h   ; FILE_SHARE_READ
    MOV R9, 0h          ; NULL
    SUB RSP, 40h
    PUSH 0h             ; NULL
    PUSH 80h            ; FILE_ATTRIBUTE_NORMAL
    PUSH 3              ; OPEN_EXISTING
    CALL CreateFileW
    ADD RSP, 40h
    CMP EAX, -1
    JNE p_skip_invalid_create_file
    CALL InternalError
    p_skip_invalid_create_file::

This is a subset of the full program which can be found here.

When I run the program, I type my file name ("test.txt") into it (test.txt is located within the source files, which can also be found on the GitHub). TextTestfilePath holds that ReadConsoleW output with the trailing CRLF removed. In memory, it reads as 0074 0065 0073 0074 002e 0074 0078 0074 0000 or ".t.e.s.t...t.x.t..", which to my understanding is valid Unicode (UTF-16).

When executing the code, CreateFileW returns -1 (INVALID_HANDLE_VALUE), and a subsequent call to GetLastError returns 0x57 (ERROR_INVALID_PARAMETER). I have tried calling SetLastError to set it to zero before the call and receive the same response.

After quite a bit of conversation with GPT-4, I still can't seem to find the source of the issue. I have verified the following:

  • Each of the parameters is correct.
  • The TextTestfilePath is correctly written to in memory (and succeeded in an earlier call to GetFileAttributesW)
  • The RDX, R8, and the first stack (final push) parameters are correct to the best of my understanding.
  • I believe I am correctly allocating the shadow space needed for the call to succeed.

I am still learning MASM x64 with the limited information there is about it out there, but I have a general understanding of how it all works, and I've read a few books on it and used a portion of the Win32 Console API up to this point.

But every time I get this parameter error, I am at a complete loss. It's so vague that I don't know where to check, and the things I do check never seem to be the issue. So if anyone has any idea of more things I'd need to check (or heck, if you see the issue) (or heck heck, if you have any tips for me that I have yet to figure out), please let me know! :)

Before commenting that I am doing something wrong, please help not only me but the community find well-documented sources to learn MASM x64 that can explain that concept well! Just saying, "You're doing something wrong, and you need to fix it," neither helps resolve this issue nor contributes to a discussion that encourages learning and education, which would be expected from a site like StackOverflow. Links to third-party sources, in addition to the obvious Microsoft docs, are incredibly helpful for a big-picture overview of what is expected, instead of assuming certain things are known when they may not be.

FireController1847
  • Your parameters are in the wrong place on the stack. The 5th must be at [rsp+20h], the 6th at [rsp+28h], and so on. But you use `push`, which is wrong. – RbMm Jun 17 '23 at 10:42
  • @RbMm I guess I should've noted, I also tried SUB RSP, 20h; SUB RSP, 50h, and various other stack positions to no avail. I have not found any good resources to teach me the conventions for MASM x64 programming, so I've been piecing together what I can where I can. I was unfamiliar with the syntax MOV [RSP+20h], value. Though, the Windows documentation has little to say on the exact stack positions of the values, at least from what I've been able to find. I'm sure I missed something somewhere. If you have any resources that could point me in the right direction, that'd be incredibly helpful! – FireController1847 Jun 17 '23 at 10:45
  • https://learn.microsoft.com/en-us/cpp/build/stack-usage?view=msvc-170 – RbMm Jun 17 '23 at 10:49
  • `CMP EAX, -1` is also formally incorrect, even though it will always work. It should be `CMP RAX, -1`. Instead of `CALL CreateFileW`, it is better to use `CALL __imp_CreateFileW`. – RbMm Jun 17 '23 at 10:53
  • @RbMm I appreciate the insight. I was unfamiliar with the `__imp_` notation for access, which is why I have been manually defining everything. I use PROTO in the code. I've read that stack page many times, but it must be going over my head because I have no clue how what's mentioned there translates to MOV [RSP+20h], VAL. The best of what it mentions is to "push arguments onto the stack," repeated three times, with no mention of what that offset is or how far it needs to go. From what I've learned through ChatGPT, it's 20h shadow for first 4, +8 for every stack arg, aligned to the nearest 16 – FireController1847 Jun 17 '23 at 10:57
  • STRONG SUGGESTION: 1) Write a C program that calls the "CreateFile()" Win32 API. 2) Disassemble: `cl /FAs mytest.c` – paulsm4 Jun 18 '23 at 22:02
  • I appreciate the insight @paulsm4! I found a series which does a similar thing with their example of an add function, would you say this is similar to what results from that? https://medium.com/@sruthk/cracking-assembly-stack-frame-layout-in-x64-75eb862dde08 – FireController1847 Jun 18 '23 at 22:03

1 Answer

I've figured it out. I was indeed pushing onto the stack wrong. But I had a fundamental misunderstanding of how the stack worked which the Microsoft docs did a horrible job of explaining.

What I Did Wrong

Attempt #1

As @RbMm pointed out in the comments, the arguments are expected to be on RSP+20h, RSP+28h, and RSP+30h respectively. In addition, there needs to be the shadow space on the stack for the function call. I was making a series of mistakes which caused this not to work.

Here is how I had written the code previously:

    LEA RCX, TextTestfilePath
    MOV RDX, 80000000h      ; GENERIC_READ
    MOV R8, 00000001h       ; FILE_SHARE_READ
    MOV R9, 0h              ; NULL
    SUB RSP, 20h
    PUSH 00h
    PUSH 80h
    PUSH 3
    CALL CreateFileW
    ADD RSP, 20h

  1. I was moving the stack pointer to reserve the shadow space. This part is correct: 20h is 32 bytes in hexadecimal, which is the size of the shadow space, and because 20h is a multiple of 16 it does not disturb the stack's 16-byte alignment.

  2. I was pushing the arguments onto the stack, and this is where it went wrong. RSP, the stack pointer, references the top of the stack. Each PUSH subtracts 8 from RSP and stores its value at the new top, so after three pushes my values sat at [RSP], [RSP+8h], and [RSP+10h], where the callee expects its shadow space, instead of at [RSP+20h] and above. On top of this, the three pushes moved the stack pointer by 18h, so it was no longer 16-byte aligned, and the ADD RSP, 20h after the call did not undo them. The stack arguments should be stored relative to a stack pointer that stays put, not placed with PUSH.

  3. After having pushed, with the values in the wrong position and the pointer in the wrong spot, the call would fail entirely.
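
To make the second point concrete, here is an annotated trace of the fragment above. X stands for whatever value RSP holds before the fragment runs (a placeholder introduced only for this illustration), and the CALL is left out, so treat this as a reading aid rather than code to run.

```asm
; Annotated trace only: X is a placeholder for the initial value of RSP.
SUB RSP, 20h    ; RSP = X - 20h
PUSH 00h        ; RSP = X - 28h, [RSP] = 0
PUSH 80h        ; RSP = X - 30h, [RSP] = 80h
PUSH 3          ; RSP = X - 38h, [RSP] = 3

; Just before CALL CreateFileW would execute:
;   [RSP]       = 3      these three values sit inside the 20h bytes
;   [RSP + 8h]  = 80h    that the callee treats as its shadow space
;   [RSP + 10h] = 0
;   [RSP + 20h] = ?      argument 5 is read from here: never written
;   [RSP + 28h] = ?      argument 6 is read from here: never written
;   [RSP + 30h] = ?      argument 7 is read from here: never written
; RSP has moved by 38h in total, which is not a multiple of 16.
```

The callee therefore reads leftover stack contents as its fifth through seventh arguments, which is consistent with the ERROR_INVALID_PARAMETER result.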

Attempt #2

So, I attempted to correct for these mistakes by doing the following. However, I made a fatal mistake again in this process:

    LEA RCX, TextTestfilePath
    MOV RDX, 80000000h      ; GENERIC_READ
    MOV R8, 00000001h       ; FILE_SHARE_READ
    MOV R9, 0h              ; NULL
    MOV [RSP + 20h], 3
    MOV [RSP + 28h], 80h
    MOV [RSP + 36h], 00h
    SUB RSP, 20h
    CALL CreateFileW
    ADD RSP, 20h

There are three mistakes here, and these should be more obvious.

  1. I was storing the values relative to RSP and only then moving the stack pointer. After SUB RSP, 20h, the values sit at [RSP+40h] and above, 20h bytes past where the callee looks for them. The stores also wrote to stack memory that this code had never reserved.

  2. In 20h, 28h, and 36h, I was doing the math wrong. I was adding 8 in decimal (20+8=28, 28+8=36) when I should have been adding 8 in hexadecimal (20h+8h=28h, and 28h+8h=30h, not 36h).

  3. MASM cannot assemble MOV [RSP+28h], 80h, because neither the memory operand nor the immediate says how many bytes to store. I needed to specify the size by adding QWORD PTR before the memory operand. (Notably, I am on x64, where each stack slot is 8 bytes, so I used QWORD instead of the DWORD that almost all of the MASM examples out there say is correct.)
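
The offset arithmetic and the size specifier from the last two points can be sketched as a small procedure. The name SlotDemo is hypothetical and the procedure calls nothing; it only shows where each 8-byte store lands, assuming it is assembled with ml64.

```asm
.CODE
SlotDemo PROC FRAME         ; hypothetical name, for illustration only
    SUB RSP, 48h            ; reserve offsets 0h through 47h
    .ALLOCSTACK 48h         ; describe that allocation in the unwind data
    .ENDPROLOG

    ; Stack slots are 8 bytes wide, so offsets advance by 8h:
    ; 20h, 28h, 30h, 38h, ...
    ; MOV [RSP + 28h], 80h  ; does not assemble: no operand gives a size
    MOV QWORD PTR [RSP + 20h], 3      ; fills bytes 20h through 27h
    MOV QWORD PTR [RSP + 28h], 80h    ; fills bytes 28h through 2Fh
    MOV QWORD PTR [RSP + 30h], 0      ; fills bytes 30h through 37h
                                      ; (28h + 8h = 30h, not 36h)

    ADD RSP, 48h            ; release exactly what was reserved
    RET
SlotDemo ENDP
END
```

An offset of 36h would place the store 6 bytes into the third slot, straddling two slots instead of filling one.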

Attempt #3

After I resolved these problems, my code resulted in the following:

    LEA RCX, TextTestfilePath
    MOV RDX, 80000000h      ; GENERIC_READ
    MOV R8, 00000001h       ; FILE_SHARE_READ
    MOV R9, 0h              ; NULL
    SUB RSP, 20h
    MOV QWORD PTR [RSP + 20h], 3
    MOV QWORD PTR [RSP + 28h], 80h
    MOV QWORD PTR [RSP + 30h], 00h
    CALL CreateFileW
    ADD RSP, 20h

This code does the following:

  1. It moves the first four arguments into the registers, as before.

  2. It lowers the stack pointer (which, as explained before, is the top of the stack) by 20h, which opens up 32 bytes of shadow space and preserves 16-byte alignment. It's important that we don't overwrite the 32 bytes we just opened up: they are for the callee to use, and your arguments do not go in this space.

  3. It stores the arguments relative to the newly modified stack pointer, offset by 20h so that they land just above the shadow space instead of overwriting it.

And yes, if you're seeing what I am seeing, this code is actually the same thing as doing this:

    MOV QWORD PTR [RSP], 3
    MOV QWORD PTR [RSP + 8h], 80h
    MOV QWORD PTR [RSP + 10h], 00h
    SUB RSP, 20h

This does the exact same thing, but it puts the arguments onto the stack before allocating the shadow space.

I prefer the syntax of +20h to account for the shadow space, as it makes it more obvious to me that we are taking it into account. But what I want you to get out of this is that the documentation for the stack is terrible.

Attempt #4

As @RaymondChen pointed out in the comments, I was not taking into account the prolog and epilog of my function. RSP should only be adjusted in the prolog and epilog, not inside the body of a function. The other nonvolatile registers (RBX, RBP, RDI, RSI, and R12 through R15) may be used in the body, but they must then be saved in the prolog and restored in the epilog so the caller gets them back unchanged. This is the purpose of the prolog and epilog, alongside letting the system walk the stack when an exception occurs.

The updated function call does essentially the same thing as before, but does not modify the stack pointer. It relies on the function's prolog having already reserved the shadow space and the slots for the stack arguments:

    LEA RCX, TextTestfilePath
    MOV RDX, 80000000h      ; GENERIC_READ
    MOV R8, 00000001h       ; FILE_SHARE_READ
    MOV R9, 0h              ; NULL
    MOV QWORD PTR [RSP + 20h], 3
    MOV QWORD PTR [RSP + 28h], 80h
    MOV QWORD PTR [RSP + 30h], 00h
    CALL CreateFileW

I've updated the "standard" below.

The x64 Stack Usage Standard (in better terms)

Here is the actual x64 stack usage standard that you need to follow when calling a Win32 function in MASM x64:

  1. At the beginning of your function (including main), set up a prolog.
    • This prolog is where you allocate the 20h (32 bytes) of shadow space for function calls by subtracting from the stack pointer, along with the space for any local variables and stack arguments. Choose the total so that RSP is a multiple of 16 at each CALL. An example is given below.
  2. Assign your first four arguments to RCX, RDX, R8, and R9 for ARG1, ARG2, ARG3, and ARG4 respectively.
  3. Store your remaining arguments on the stack without modifying the stack pointer, past the 20h of reserved space (that is, MOV QWORD PTR [RSP+20h], ARG5, MOV QWORD PTR [RSP+28h], ARG6, MOV QWORD PTR [RSP+30h], ARG7 and so on).
  4. CALL your Win32 function.
  5. At the end of your function, after all calls are completed (including main), set up an epilog.
    • This epilog is where you restore the stack pointer to the value it had before the prolog. You'll add the same value you subtracted at the beginning of the function.

An example of a proper Win32 function call is shown below:

INCLUDELIB kernel32.lib

.CODE
main PROC
    LOCAL LocalVariable: QWORD

    ; Prolog
    PUSH RBP        ; Save RBP to restore it after
    MOV RBP, RSP    ; Keep the entry RSP in RBP as a frame pointer
    SUB RSP, 40h    ; 20h of shadow space for function calls
                    ; 8h for the one local QWORD variable
                    ; 18h for 3 stack arguments

    ; ARG1 through ARG7 and MyWin32Function are placeholders
    MOV RCX, ARG1   ; Put ARG1 into RCX
    MOV RDX, ARG2   ; Put ARG2 into RDX
    MOV R8, ARG3    ; Put ARG3 into R8
    MOV R9, ARG4    ; Put ARG4 into R9
    MOV QWORD PTR [RSP + 20h], ARG5    ; Put ARG5 into RSP+20h
    MOV QWORD PTR [RSP + 28h], ARG6    ; Put ARG6 into RSP+28h
    MOV QWORD PTR [RSP + 30h], ARG7    ; Put ARG7 into RSP+30h
    CALL MyWin32Function

    ; Technically, you don't need an epilog if your next
    ; call is going to end the process. I provide it
    ; as an example.

    ; Epilog
    ADD RSP, 40h    ; Same value as in the prolog; RSP now equals RBP
    POP RBP         ; Restore original RBP
    RET
main ENDP
END

This ensures that when you store your arguments (at RSP+20h and above), they stay within the space your prolog reserved (RSP to RSP+40h).

You must also follow this prolog and epilog methodology for any functions you write yourself. This avoids needing to allocate the 20h of shadow space for every function call, and it is part of what Win32 exception handling needs under the x64 calling convention so that it (and you) can 'walk the stack.'
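
The comments on this answer point out that a prolog also needs unwind data before the system can walk the stack. A minimal sketch of what that could look like for the CreateFileW call follows, using MASM's PROC FRAME together with the .PUSHREG, .ALLOCSTACK, and .ENDPROLOG directives. The procedure name OpenForRead and its one-argument interface are assumptions made up for this illustration, and the sketch assumes ml64 and linking against kernel32.lib.

```asm
INCLUDELIB kernel32.lib

EXTERN CreateFileW: PROC

.CODE
; Hypothetical helper, not part of the original program.
; In:  RCX = pointer to a null-terminated UTF-16 path
; Out: RAX = file handle, or -1 (INVALID_HANDLE_VALUE) on failure
OpenForRead PROC FRAME
    ; Prolog: each stack change is paired with an unwind directive
    PUSH RBP                ; save a nonvolatile register
    .PUSHREG RBP            ; record the push in the unwind data
    SUB RSP, 40h            ; 20h shadow space + 18h for arguments 5-7
                            ; + 8h padding; RSP is now 16-byte aligned
    .ALLOCSTACK 40h         ; record the allocation in the unwind data
    .ENDPROLOG

    ; RCX already holds lpFileName
    MOV RDX, 80000000h              ; dwDesiredAccess = GENERIC_READ
    MOV R8, 00000001h               ; dwShareMode = FILE_SHARE_READ
    MOV R9, 0h                      ; lpSecurityAttributes = NULL
    MOV QWORD PTR [RSP + 20h], 3    ; dwCreationDisposition = OPEN_EXISTING
    MOV QWORD PTR [RSP + 28h], 80h  ; dwFlagsAndAttributes = FILE_ATTRIBUTE_NORMAL
    MOV QWORD PTR [RSP + 30h], 0    ; hTemplateFile = NULL
    CALL CreateFileW                ; result is returned in RAX

    ; Epilog: undo the prolog in reverse order
    ADD RSP, 40h
    POP RBP
    RET
OpenForRead ENDP
END
```

A caller would load the path pointer into RCX, call OpenForRead from inside its own properly set up frame, and then compare RAX (not EAX) against -1.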

Hopefully this helps someone understand this a little better.


I am not sure why the standards express things in terms of right to left, or front to back, or top to bottom, because this explanation is unintuitive and subjective depending on how you are viewing the stack. Using terms like ADD or SUBTRACT makes much more sense and is universal no matter the way the stack is being displayed.

I hope that this helps someone avoid the 6-7 hours of research and pain that I went through, and helps explain the stack much better! If anyone has any comments regarding my explanation as to things I may have overlooked or explained incorrectly, please let me know. However, so far this has worked for me 100% of the time.

FireController1847
  • You should not modify rsp outside the prologue and epilogue. If you do, you will have to declare additional unwind codes so the system can walk the stack in case of an exception or signal. – Raymond Chen Jun 18 '23 at 03:01
  • @RaymondChen I am a little confused, but I am trying to understand the prolog and epilog. Am I supposed to be setting this up in my "main" function as well? As that's where this resides. Would you mind explaining a little more what I would need to do to resolve this issue in my code that I've posted? – FireController1847 Jun 18 '23 at 03:15
  • x64 code requires additional metadata known as "unwind codes", which are generated by special directives. [Details](https://learn.microsoft.com/cpp/build/exception-handling-x64). If you move the stack pointer outside of a prologue or epilogue, and you don't have a frame pointer, then you need to emit an unwind code to let the operating system know how to walk the stack in case of an exception or signal. The only cases I can think of offhand where you would need to move the stack pointer in the middle of a function are `alloca` and shrink-wrapping, neither of which applies to your function. – Raymond Chen Jun 18 '23 at 05:18
  • @RaymondChen After some research, I think I understand. Would you be willing to take a glance at my code and see if I've set up the epilogs and prologs correctly? https://github.com/FireController1847/masmtest/blob/98d3dff202ad8ad46538ad4d9576dc8c123af2e5/src/FileTest/filetest.asm – FireController1847 Jun 18 '23 at 21:19
  • I can't edit my comment anymore, but it may be better to link to master as I made some final changes as well. I've updated the answer to take into account the epilog and prolog, please let me know if there's anything else I missed. Thank you! https://github.com/FireController1847/masmtest/blob/master/src/FileTest/filetest.asm – FireController1847 Jun 18 '23 at 22:10
  • I still don't see any unwind codes. For example, after `push rbp`, there should be a `.pushreg rbp`. – Raymond Chen Jun 18 '23 at 23:23
  • @RaymondChen It's my understanding that directives like that only apply to FRAME functions, am I wrong? – FireController1847 Jun 18 '23 at 23:35
  • "[Every function that allocates stack space, calls other functions, saves nonvolatile registers, or uses exception handling must have a prolog whose address limits are described in the unwind data associated with the respective function table entry](https://learn.microsoft.com/en-us/cpp/build/prolog-and-epilog?view=msvc-170)." Note also that `mov rsp, rbp` is not a legal instruction in an epilogue. – Raymond Chen Jun 19 '23 at 00:04
  • @RaymondChen Do you know of an example program I can see for reference written for MASM x64 that does everything you're saying correctly? Because clearly I'm either not understanding, or the documentation is written poorly — nowhere have I read, found, or even seen in my research this portion of the x86 calling convention other than the single paragraph mentioned there. Nowhere I've found describes how I need to do these things. I do appreciate the insights but I'm tired of having to go on a wild goose chase to search for specifics, and then piece together what the actual answer might be. – FireController1847 Jun 19 '23 at 04:56
  • No professional programmer writes an entire x86-64 program in assembly language. I don't know of any examples. At best, maybe a short function or two are written in assembly. A complete example of a short function is given in the documentation I already linked to. – Raymond Chen Jun 19 '23 at 12:27
  • @RaymondChen I appreciate your feedback and opinion, and I respectfully disagree. This is not the site for a discussion on whether or not I should be doing this, so if you have no further educational resources or examples to share with me, I'd appreciate if we kept the discussion to the question at hand. Thanks! – FireController1847 Jun 19 '23 at 21:50
  • Upon rereading, I apologize for the tone of my remarks. I didn't mean "You have no business doing this." I meant "You won't find examples of this because nobody does it." The examples are of writing individual functions in assembly, because that's what people actually do. Now, you can extrapolate that to an entire program, since `main` is itself just a function, but I don't think you'll find a fully annotated end-to-end of "a program written entirely in assembly language that is production-ready" because nobody does that. – Raymond Chen Jun 19 '23 at 22:03
  • @RaymondChen I actually really appreciate the clarification! I apologize if mine came across disrespectful as well. I appreciate the feedback, honestly it's part of the reason I made the post. I want to contribute more to this realm due to a significant lack of fleshed-out documentation and knowledge there is about MASM x64 out there. I'll definitely be continuing my research to try and find any more information in regards to this subject, hopefully I can gather some additional clarification on the unwind process. I was frustrated the night before, but that's no excuse for how I responded. – FireController1847 Jun 19 '23 at 22:19