Can you use the red zone with/across syscalls?

Question

Consider this GNU Assembler program, which copies one byte at a time from stdin to stdout, with a one-second delay between bytes:

#include <sys/syscall.h>

.global _start

_start:
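    movq $1, -16(%rsp)          /* timespec.tv_sec  = 1, kept in the red zone */
    movq $0, -8(%rsp)           /* timespec.tv_nsec = 0 */
    movl $1, %edx               /* length 1 for read and write; syscall preserves %rdx */

.again:
    xorl %edi, %edi             /* fd 0 (stdin) */
    leaq -17(%rsp), %rsi        /* one-byte buffer just below the timespec */
    movl $SYS_read, %eax
    syscall                     /* read(0, buf, 1) */

    cmpq $1, %rax               /* anything other than one byte read: EOF or error */
    jne .end

    leaq -16(%rsp), %rdi        /* pointer to the timespec in the red zone */
    xorl %esi, %esi             /* rem = NULL */
    movl $SYS_nanosleep, %eax
    syscall                     /* nanosleep(&ts, NULL) */

    movl $1, %edi               /* fd 1 (stdout) */
    leaq -17(%rsp), %rsi        /* same one-byte buffer */
    movl $SYS_write, %eax
    syscall                     /* write(1, buf, 1) */

    jmp .again

.end:
    xorl %edi, %edi             /* exit status 0 */
    movl $SYS_exit_group, %eax
    syscall                     /* exit_group(0) */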

It passes pointers to the red zone to syscalls, for both inputs and outputs, and also expects the rest of the red zone to be preserved across unrelated syscalls. Is this a safe use of the red zone that's guaranteed to always work, or is it UB that just happened to appear to work in my test?

`syscall` has no reason to touch the user-space stack, so I'd definitely expect it to work reliably. I haven't looked at official documentation for a while, though, if there is any besides the ABI doc. — Peter Cordes, Jul 03 '22 at 20:32

score 2 · Accepted Answer · answered Jul 03 '22 at 21:33

Is this a safe use of the red zone that's guaranteed to always work, or is it UB that just happened to appear to work in my test?

It's guaranteed to be safe by the kernel developers.

In general (to guard against deliberately malicious software) CPUs are designed so that when you switch from a lower privilege level (user-space) to a higher privilege level (kernel), the CPU forces a stack switch (e.g. from the "untrusted user-space stack" to the "more trusted kernel stack"); the CPU also does the reverse, switching stacks when returning from the higher privilege level to the lower one.

This makes it easy for kernel developers to ensure that system calls (and IRQs, etc.) don't interfere with a user-space thread's red zone. It doesn't strictly prevent a kernel from interfering with it, though: a kernel could do extra work for no reason to interfere, if the kernel developer wanted their kernel to be awful.
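
As a rough illustration, here is a minimal C sketch (GCC/Clang inline assembly, x86-64 Linux only) that stores a marker in the red zone, makes a syscall, and then checks that the marker is still there. The helper name `red_zone_survives_syscall`, the marker value, and the choice of `getpid` are assumptions made for this example, not anything from the question. A passing check only demonstrates the behaviour on the kernel you ran it on; it doesn't prove the guarantee. Also note that the compiler may use the red zone itself, which is why the function is kept tiny and `noinline`, and why the marker sits at the far end of the red zone.

```c
#include <sys/syscall.h>   /* SYS_getpid */

/* Hypothetical helper (not from the question): returns 1 if a marker stored
 * in the red zone is intact after a syscall, 0 if it changed, and -1 on
 * targets other than x86-64 Linux, where the check does not apply. */
__attribute__((noinline))
long red_zone_survives_syscall(void)
{
#if defined(__x86_64__) && defined(__linux__)
    long intact;
    __asm__ volatile(
        "movq $0x5a5a5a5a, -128(%%rsp)\n\t" /* marker in the lowest 8 bytes of the red zone */
        "movl %[nr], %%eax\n\t"             /* getpid takes no arguments */
        "syscall\n\t"                       /* kernel clobbers rax, rcx, r11 */
        "xorl %%eax, %%eax\n\t"             /* discard the pid, zero the result */
        "cmpq $0x5a5a5a5a, -128(%%rsp)\n\t" /* is the marker still there? */
        "sete %%al\n\t"                     /* result = 1 if unchanged */
        : "=a"(intact)
        : [nr] "i"(SYS_getpid)
        : "rcx", "r11", "cc", "memory");
    return intact;
#else
    return -1;  /* no red-zone check on this target */
#endif
}
```

Calling this from `main` and printing the result should show 1 on x86-64 Linux. The assembly in the question is still the more faithful test, since there no compiler is competing for the red zone.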

The `syscall` instruction does not itself do a stack switch. But it's designed (along with `swapgs`) to make it possible for kernel code to switch stacks without the possibility of an interrupt using the RSP value that user-space left. The CPU itself does stack switching for you on user-space `int 0x80`, page faults and other exceptions, and on hardware interrupts. — Peter Cordes, Jul 03 '22 at 23:50
But yeah, there's no reason for the kernel to mess up user-space stack memory, and any such use would likely be a security bug (since another user-space thread could modify that memory while it was being used as a kernel stack, unless the kernel was just storing non-sensitive data there for no reason.) — Peter Cordes, Jul 03 '22 at 23:52
