2

I've run into some strange behaviour when trying to obtain the current stack pointer in C (using inline ASM). The code looks like:

#include <stdio.h>
class os {
  public:
    static void* current_stack_pointer();
};

void* os::current_stack_pointer() {
  register void *esp __asm__ ("rsp");
  return esp;
}

int main() {
  printf("%p\n", os::current_stack_pointer());
}

If I compile the code using the standard gcc options:

$ g++ test.cc -o test

It generates the following assembly:

__ZN2os21current_stack_pointerEv:
0000000000000000        pushq   %rbp
0000000000000001        movq    %rsp,%rbp
0000000000000004        movq    %rdi,0xf8(%rbp)
0000000000000008        movq    0xe0(%rbp),%rax
000000000000000c        movq    %rax,%rsp
000000000000000f        movq    %rsp,%rax
0000000000000012        movq    %rax,0xe8(%rbp)
0000000000000016        movq    0xe8(%rbp),%rax
000000000000001a        movq    %rax,0xf0(%rbp)
000000000000001e        movq    0xf0(%rbp),%rax
0000000000000022        popq    %rbp

If I run the resulting binary, it crashes with SIGILL (illegal instruction). However, if I add a little optimisation to the compile:

$ g++ -O1 test.cc -o test

The generated assembly is much simpler:

0000000000000000        pushq   %rbp
0000000000000001        movq    %rsp,%rbp
0000000000000004        movq    %rsp,%rax
0000000000000007        popq    %rbp
0000000000000008        ret

And the code runs fine. So, to the question: is there a more stable way to get hold of the stack pointer from C code on Mac OS X? The same code has no problems on Linux.

Peter Cordes
Michael Barker
  • 4
    Isn't a _function call_ designed to get the stack address essentially doomed by design? – zneak Oct 18 '11 at 18:51
  • Furthermore, why is that function a *non-static class method*? What could you possibly want a `this` pointer for? – Adam Rosenfield Oct 18 '11 at 19:26
  • Checked the original code and it was declared static and I've updated the question. I'm just working some existing code. Most of the use cases use an inequality so if the value is off slightly, then it is not too much of an issue. The key problem for me is the illegal instruction exception. – Michael Barker Oct 18 '11 at 19:47

4 Answers

5

The problem with fetching the stack pointer through a function call is that, inside the called function, the stack pointer points at a location that is only valid for the duration of that call. Once the function returns, the value you captured no longer matches the caller's stack pointer. You are also assuming that the compiler added no function prologue on that platform. Both of your functions currently have one: the compiler sets up the activation record for the function on the stack, and that changes the value of RSP you are trying to capture.

Even if there were no prologue, you would need to add the size of a pointer to the captured value to get the "true" address the stack pointer will hold after the call returns. This is because the call instruction pushes the return address onto the stack, and ret in the callee pops it off again. Inside the callee, the stack pointer therefore points, at the very least, at the return address, and that location is no longer part of the stack once the call has returned.

Finally, on certain platforms (unfortunately not x86), you can use __attribute__((naked)) to create a function with no prologue in gcc. Using the inline keyword to avoid a prologue is not completely reliable, since it does not force the compiler to inline the function: at low optimization levels inlining will not occur, you will end up with a prologue again, and the captured stack pointer will not be the value you expect.
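
To make the offset concrete, here is a minimal sketch (not part of the original answer) that reads the stack pointer once in a caller and once inside a function that is really called, then reports the difference. The names read_sp, sp_inside_callee and caller_minus_callee are made up for illustration, the noinline/always_inline attributes are GCC/Clang extensions, and the aarch64 and fallback branches are assumptions added so the sketch builds on more than x86-64.

```cpp
#include <cstdint>

// Hypothetical helper: reads the stack pointer in place. always_inline keeps
// the read inside whichever function uses it, so no extra call is involved.
static inline __attribute__((always_inline)) std::uintptr_t read_sp() {
    std::uintptr_t sp;
#if defined(__x86_64__)
    __asm__ __volatile__("movq %%rsp, %0" : "=r"(sp));
#elif defined(__aarch64__)
    __asm__ __volatile__("mov %0, sp" : "=r"(sp));
#else
    // Assumption: on other targets fall back to the frame address,
    // which is close to, but not the same as, the stack pointer.
    sp = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
#endif
    return sp;
}

// A real call: the value is read inside the callee's frame.
__attribute__((noinline)) std::uintptr_t sp_inside_callee() {
    return read_sp();
}

// Number of bytes by which the callee's reading lies below the caller's own.
__attribute__((noinline)) std::intptr_t caller_minus_callee() {
    std::uintptr_t here = read_sp();
    std::uintptr_t there = sp_inside_callee();
    return static_cast<std::intptr_t>(here - there);
}
```

On x86-64 the difference should be at least 8 bytes, which is the return address pushed by call. Anything beyond that comes from the callee's prologue and is up to the compiler.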

If you must have the value of the stack pointer, then the only reliable method will be to use assembly, follow the rules of your platform's ABI, compile to an object file using an assembler, and then link that object file with the rest of the object files in your executable. You can then expose the assembler function to the rest of your code by including a function declaration in a header file. So your code could look like (assuming you're using gcc to compile your assembly):

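//get_stack_pointer.h
extern "C" void* get_stack_ptr();

//get_stack_pointer.S
/* On Mac OS X, C symbols carry a leading underscore, so the label and
   the .globl line would be _get_stack_ptr there. */
.text
.globl get_stack_ptr

get_stack_ptr:
    movq %rsp, %rax     /* RSP currently points at the return address */
    addq $8, %rax       /* skip it: this is the caller's RSP after ret */
    ret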
Jason
  • It looks like the __attribute__((naked)) is only supported on some platforms and gets ignored by gcc on x86. – Michael Barker Oct 18 '11 at 19:25
  • Thanks for the info ... I'll modify my post. – Jason Oct 18 '11 at 19:33
  • 1
    *completely* different is a bit of an over-statement; as you show, it's only different by a small amount. (But an amount that you can't control from C). BTW, `lea 8(%rsp), %rax` would get the job done in one instruction, replacing mov+add. – Peter Cordes Jan 13 '21 at 10:12
4

Rather than using a register variable with a constraint, you should just write some explicit inline assembler to fetch the stack pointer (%rsp on x86-64):

static void *getsp(void)
{
    void *sp;
    __asm__ __volatile__ ("movq %%rsp,%0"
    : "=r" (sp)
    : /* No input */);
    return sp;
}

You can also convert this to a macro using gcc statement expressions:

#define GETSP() ({void *sp;__asm__ __volatile__("movq %%rsp,%0":"=r"(sp):);sp;})
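
As noted in the comments on the question, most uses only compare the value against a limit, so a slightly-off reading is fine. Here is a rough sketch of such an inequality check, assuming a stack that grows downwards. The names read_sp and max_depth, the 256-byte padding and the budget parameter are all invented for illustration, and the aarch64 and fallback branches are assumptions so that it builds outside x86-64.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reads the stack pointer in the function that uses it.
static inline __attribute__((always_inline)) std::uintptr_t read_sp() {
    std::uintptr_t sp;
#if defined(__x86_64__)
    __asm__ __volatile__("movq %%rsp, %0" : "=r"(sp));
#elif defined(__aarch64__)
    __asm__ __volatile__("mov %0, sp" : "=r"(sp));
#else
    // Assumption: the frame address is a good enough approximation elsewhere.
    sp = reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
#endif
    return sp;
}

// Recurses until more than `budget` bytes of stack lie between `base` and the
// current stack pointer, then reports the depth that was reached.
__attribute__((noinline))
int max_depth(std::uintptr_t base, std::size_t budget, int depth) {
    volatile char pad[256];   // make every frame occupy some stack
    pad[0] = 0;
    if (base - read_sp() >= budget) {
        return depth;         // "the stack is at least this deep": stop here
    }
    int reached = max_depth(base, budget, depth + 1);
    pad[0] = 1;               // work after the call, so it is not a tail call
    return reached;
}
```

A caller would record base with read_sp() near the top of the thread and pass it down, for example max_depth(read_sp(), 64 * 1024, 0). The check only needs the reading to be in the right neighbourhood, which is why the few bytes of slack discussed above do not matter for this kind of use.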
caf
  • To call a function requires a `call` command (pushing a return instruction address on the stack), thus the value of `ESP` will be pointing to a value in the stack that is within the stack-frame of the callee, not the caller . This makes the returned stack-pointer value unusable in the caller since it will not reflect the current address of the stack-pointer once the callee has completed. I see you have accrued a lot of reputation points under the `assembly` tag, so is there something I've missed where this code snippet will return a valid stack-pointer value back to the caller? – Jason Oct 19 '11 at 22:44
  • 2
    @Jason: It's possible of course to convert this to a macro which will obviate that problem (I've updated the answer to include that), but it's actually the case that most times when you want to know the stack pointer, getting a lower-bound on it actually suffices (ie "the current stack is no deeper than this"). Asking for the exact stack pointer within C code is folly anyway, because the C compiler will happily shift the stack pointer around as it sees fit - there's no guarantee that it won't change between when you read it and when you use the value anyway. – caf Oct 20 '11 at 04:33
3

I recently needed a multi-arch version:

/**
 * helps to check the architecture macros:
 * `echo | gcc -E -dM - | less`
 *
 * this is arm, x64 and i386 (linux | apple) compatible
 * @return the stack pointer as seen inside this function
 */
void *get_sp(void) {
    void *sp;
    __asm__ __volatile__(
#if defined(__x86_64__)
        "movq %%rsp,%0"
#elif defined(__i386__)
        "movl %%esp,%0"
#elif defined(__arm__)
        // sp is an alias for r13; ARM syntax puts the destination first
        "mov %0,sp"
#else
#error "get_sp: unsupported architecture"
#endif
        : "=r" (sp)
        : /* no input */
    );
    return sp;
}
5422m4n
0

I do not have a reference for that, but GCC is known to occasionally (often) misbehave in the presence of inline assembly if compilation is not optimized at all. So you should always add the -O1 flag.

As a side-note, what you are trying to do is not very robust in the presence of an optimizing compiler, because the compiler may inline the call to current_stack_pointer() and the returned value may thus be an approximation of the current stack pointer value (not even a lower bound).
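
If an approximate value is all that is needed, one option that avoids inline assembly and register variables altogether is the GCC/Clang builtin __builtin_frame_address(0). This is a different technique from the ones in the other answers: it returns the frame address of the current function, not the exact stack pointer. A minimal sketch reusing the class from the question, with noinline added (my assumption) so that the value always comes from the callee's own frame regardless of optimisation level:

```cpp
#include <cstdint>

class os {
  public:
    static void* current_stack_pointer();
};

// noinline: always report this function's own frame, so the result is a
// consistent lower bound for the caller instead of depending on inlining.
__attribute__((noinline)) void* os::current_stack_pointer() {
    // Frame address of this function; close to, but not equal to, RSP.
    return __builtin_frame_address(0);
}
```

Because the function is never inlined, the returned address should always lie below the caller's frame on a downward-growing stack, which is enough for inequality-style checks.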

Thomas Pornin