Purpose of saving an incoming pthread address on the stack before syscall in MUSL's x86_64 __syscall_cp_asm wrapper?

Question

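 1  __syscall_cp_asm:
 2  __cp_begin:
 3      mov (%rdi),%eax        # load the int that arg 1 points to
 4      test %eax,%eax
 5      jnz __cp_cancel        # nonzero: skip the syscall

 6      mov %rdi,%r11          # keep arg 1 (the pointer) in r11

 7      mov %rsi,%rax          # arg 2 -> syscall number
 8      mov %rdx,%rdi          # arg 3 -> syscall arg 1
 9      mov %rcx,%rsi          # arg 4 -> syscall arg 2
10      mov %r8,%rdx           # arg 5 -> syscall arg 3
11      mov %r9,%r10           # arg 6 -> syscall arg 4
12      mov 8(%rsp),%r8        # arg 7 (stack) -> syscall arg 5
13      mov 16(%rsp),%r9       # arg 8 (stack) -> syscall arg 6

14      mov %r11,8(%rsp)       # store the pointer over the arg 7 slot

15      syscall
16  __cp_end:
17      ret
18  __cp_cancel:
19      jmp __cancel           # tail call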

I am curious what the purpose of lines 6 and 14 is (renumbered from the linked source).

From what I understand, the beginning of the code tests the value pointed to by the 1st argument (lines 3–5); line 6 then copies the pointer to r11, and line 14 stores it in the stack slot that was used to pass the 7th argument.

This doesn't seem useful. Do these moves accomplish anything?

Where is this called from? Could the caller also be hand-written asm that is going to look at `(%rsp)` after this function returns? RDI seems to be some kind of internal-use pointer, so the call number is in RSI and the syscall args in later slots. — Peter Cordes, Sep 20 '20 at 19:43
I agree that if the caller is assuming a normal x86-64 SysV calling convention, storing a copy of the incoming RDI over a stack arg has no meaning; it's not something caller would look at. Unless the return address is actually the address of another function? But that would misalign the stack so it's not something you can do in fully standard x86-64 SysV. — Peter Cordes, Sep 20 '20 at 19:47
@PeterCordes It's called solely from https://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_cancel.c#n33 — Petr Skocik, Sep 20 '20 at 19:47
@PeterCordes Thanks. I'm adapting this to my own purpose, and while making modifications, these two lines struck me as odd. It seems to work if I delete them, but I thought I'd ask, hoping perhaps I'd learn something new ;). — Petr Skocik, Sep 20 '20 at 19:50
"seems to work" - unless you know what pthread cancellation points are and have constructed a test-case that would detect if they're not handled perfectly, that doesn't tell us much. (I don't really understand pthread cancellation points; I know glibc checks some stuff in its syscall wrappers, too, but I don't really understand why). — Peter Cordes, Sep 20 '20 at 19:53
However, the way it's called from C makes me wonder if it's a bug. Passing `&self->cancel` as the RDI arg and then reading `self->cancel` after the call makes me wonder if the code was supposed to have been storing something to `(%rdi)` (or `(%r11)`). But that seems too different; storing over the incoming `y` is obviously not doing that, probably not a bug. Maybe it's putting it in a known place relative to the user-space stack so something tracing system calls can find that pointer to library stuff? It's not in the path that leads to the `jmp __cancel` tailcall so it's not an arg for that. — Peter Cordes, Sep 20 '20 at 19:59
@PeterCordes It's about the begin/end labels. Cancellations (even deferred ones) need to be able to break blocking syscalls so they need to be signal based. If the signal arrives before a potentially indefinitely blocking syscall, that point is a cancellation point & the syscall must not be entered. The test at the beginning of the assembly prevents that but there's a TOCTOU there, that's closed by exporting the labels. If the cancel signal handler detects the code is between the labels it forces the instruction pointer to jump to the cancellation, preventing the thread from getting blocked. — Petr Skocik, Sep 20 '20 at 20:04

score 5 · Accepted Answer · answered Sep 20 '20 at 22:14

This is to support pthread cancellation points; a signal handler can later look at the stack.

The commit log for the commit that introduced this code explains that storing a pointer at a known place on the stack before a syscall makes it possible for the "cancellation signal handler" to determine "whether the interrupted code was in a cancellable state." (The initial version of that code also saved the address of the syscall instruction, but later commits changed that.)
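As a rough illustration of the label-based check described in the comments under the question (a sketch under stated assumptions, not musl's actual handler): the handler compares the interrupted program counter against the addresses of `__cp_begin` and `__cp_end` and, if it falls inside that range, resumes at `__cp_cancel` instead. The name `resume_pc_after_cancel` and its parameters are hypothetical; a real handler would read and write the PC through the `ucontext_t` passed to an `SA_SIGINFO` handler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decide where a thread should resume after the
 * cancellation signal interrupted it at `pc`. `begin`, `end` and `cancel`
 * stand in for the addresses of __cp_begin, __cp_end and __cp_cancel. */
uintptr_t resume_pc_after_cancel(uintptr_t pc, uintptr_t begin,
                                 uintptr_t end, uintptr_t cancel)
{
    assert(begin <= end);
    /* Inside [begin, end): the thread is somewhere between the cancel-flag
     * test and the syscall instruction, so divert it to the cancel path. */
    if (pc >= begin && pc < end)
        return cancel;
    /* Anywhere else: leave the interrupted code alone. */
    return pc;
}
```

The half-open range matters here: a PC equal to `end` means the syscall instruction has already completed, so the thread is not redirected.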

The first arg (which that asm function stores on the stack) comes from its C caller, __syscall_cp_c, which calls __syscall_cp_asm(&self->cancel, nr, u, v, w, x, y, z); with self obtained from __pthread_self().

You're correct that overwriting the caller's stack arg with a different incoming arg is not "visible" to a C caller following the x86-64 System V ABI. (A callee owns its stack args; the caller has to assume they've been overwritten, so compiler-generated code will never read that memory location as an output.) So we needed to look for alternate explanations.

I think it really does take 2 mov instructions to copy the incoming RDI into 8(%rsp) after that memory location has been read. The store can't happen until 8(%rsp) has been loaded into R8, but that load has to wait for mov %r8,%rdx to free up R8, which in turn has to wait for mov %rdx,%rdi to free up RDX, and that overwrites RDI. You could avoid touching an "extra" register by using R10 before it's used to load the other arg, but it would still take at least 2 instructions.
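To make that dependency chain concrete, here is a toy C model of the shuffle (my own illustration, not musl code). `struct machine` and `shuffle_args` are hypothetical names; each assignment mirrors one mov from lines 6 to 14 of the listing in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the registers and stack slots the wrapper touches.
 * stack[0] = return address, stack[1] = 8(%rsp), stack[2] = 16(%rsp). */
struct machine {
    uint64_t rax, rdi, rsi, rdx, rcx, r8, r9, r10, r11;
    uint64_t stack[3];
};

/* Replays lines 6 to 14 of the asm, one assignment per mov. */
void shuffle_args(struct machine *m)
{
    assert(m != NULL);
    m->r11 = m->rdi;       /* line 6: keep the pointer before RDI is reused */
    m->rax = m->rsi;       /* line 7: call number */
    m->rdi = m->rdx;       /* line 8: overwrites the pointer in RDI */
    m->rsi = m->rcx;       /* line 9 */
    m->rdx = m->r8;        /* line 10: frees R8 for the stack load */
    m->r10 = m->r9;        /* line 11 */
    m->r8  = m->stack[1];  /* line 12: read the 7th arg out of its slot */
    m->r9  = m->stack[2];  /* line 13 */
    m->stack[1] = m->r11;  /* line 14: only now is the slot safe to reuse */
}
```

Starting from the register state of a call like __syscall_cp_asm(ptr, nr, u, v, w, x, y, z), this leaves nr in rax, the six syscall args in rdi, rsi, rdx, r10, r8, r9, and ptr in stack[1]. Moving the line 14 store ahead of the line 12 load in this model would lose y.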

Or the arg order could be optimized to pass that pointer in a later arg, perhaps passing the call number last and the pthread pointer in the last register arg (minimal shuffling but avoiding need for a double dereference for that test/branch) or the first stack arg (where you want it anyway). Or match the arg order of the __syscall wrapper that takes nr first with no pthread pointer.

score -1 · Answer 2 · answered Sep 20 '20 at 20:05

The code in lines 7–14 loads the parameters to the syscall in parameter order. Since RDI is loaded at line 8, its value is saved in R11 so that it can be written to parameter 8 (on the stack) at line 14.

In hand-written assembly code, keeping things organized like this can make the code easier to understand and maintain, which outweighs the cost of an extra move instruction.

prl

I don’t think this answer is right—in fact I tried to delete it before you accepted it. If the value stored on the stack were a parameter to the syscall, then there would need to be a `sub rsp, 0x18`. – prl Sep 20 '20 at 20:10
Yeah, I didn't get what you meant either. Was just trying to `mov` on. :D – Petr Skocik Sep 20 '20 at 20:13
@PSkocik: You should probably un-accept this. Agreed this doesn't make sense; x86-64 Linux doesn't take any system-call args on the user-space stack. System calls have at most 6 args, the number the ABI has room for. But @ prl: a function is allowed to overwrite its incoming stack args. If there was a stack arg to the system call, it would be valid to overwrite the incoming `y` with the incoming `&self->cancel`, after loading `y` into the right register. (That's why it needs an r11 tmp, to delay the store until after loading `8(%rsp)`). So that's not why it's wrong. – Peter Cordes Sep 20 '20 at 20:23
@Peter, I meant that if there *were* a system call that took 8 arguments, this function would have to allocate space for them, otherwise this function’s return address would be within the parameter space of the system call. – prl Sep 20 '20 at 21:29
Ah, I see what you mean. Unless it worked like the i386 FreeBSD system-calling convention, where the kernel looks for args on the user-space stack starting at ESP+4, exactly so that libc wrapper functions like `write` can just use `int 0x80` without having to shuffle args around or save / restore the return address into a register that system calls won't clobber. (IIRC, i386 FreeBSD / MacOS takes the call number in EAX, so a `write` wrapper could be something like `mov $something, %eax` / `int 0x80` / `jc set_errno` / `ret`, with user-space stack args becoming system call args.) – Peter Cordes Sep 20 '20 at 21:38
