We know that a system call enters the kernel through entry_SYSCALL_64 in entry_64.S. Reading the source code, I found that after the registers are prepared there are two different paths: entry_SYSCALL64_slow_path and entry_SYSCALL64_fast_path. What is the difference between the two?

tyChen
  • It has to do with *ptrace*. If the process is traced, single-stepped, or syscall-emulated, Linux saves some extra registers and calls the syscall handler [here](https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/common.c#L282). Otherwise, it just calls the handler from the assembly code. I've never dug into the differences. – Margaret Bloom May 16 '20 at 17:39

1 Answer

Upon entry into entry_SYSCALL_64, Linux will:

  1. Swap gs to get per-cpu parameters.
  2. Switch to the kernel stack, using the per-cpu parameters obtained in step 1.
  3. Disable the IRQs.
  4. Create a partial pt_regs structure on the stack. This saves the caller context.
  5. Go to the slow path if the current task has _TIF_WORK_SYSCALL_ENTRY or _TIF_ALLWORK_MASK set.
  6. Enter the fast path otherwise.
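
The decision in steps 5 and 6 boils down to a bitmask test on the task's thread flags. Here is a minimal sketch of that test in plain C. The `DEMO_*` names and bit positions are made up for illustration; the real flags and masks are defined in the kernel's x86 thread_info header and vary between kernel versions.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. They do not match the
 * kernel's real TIF_* bit positions. */
#define DEMO_TIF_SYSCALL_TRACE  (1u << 0)  /* task is being ptrace'd    */
#define DEMO_TIF_SYSCALL_AUDIT  (1u << 1)  /* syscall auditing active   */
#define DEMO_TIF_SECCOMP        (1u << 2)  /* seccomp filter installed  */
#define DEMO_TIF_NEED_RESCHED   (1u << 3)  /* unrelated to syscall entry */

/* Stand-in for the "work to do on entry" mask checked in step 5. */
#define DEMO_WORK_SYSCALL_ENTRY \
    (DEMO_TIF_SYSCALL_TRACE | DEMO_TIF_SYSCALL_AUDIT | DEMO_TIF_SECCOMP)

enum demo_path { DEMO_FAST_PATH, DEMO_SLOW_PATH };

/* Any bit of the work mask set means the slow path; otherwise fast path. */
static enum demo_path demo_choose_entry_path(uint32_t thread_flags)
{
    return (thread_flags & DEMO_WORK_SYSCALL_ENTRY) ? DEMO_SLOW_PATH
                                                    : DEMO_FAST_PATH;
}
```

With these made-up bits, an untraced task (flags equal to 0) takes the fast path, while a task with the trace bit set takes the slow path.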

_TIF_WORK_SYSCALL_ENTRY is defined here with a comment stating:

/*
 * work to do in syscall_trace_enter().  Also includes TIF_NOHZ for
 * enter_from_user_mode()
 */

_TIF_ALLWORK_MASK does not seem to be defined for x86; a definition for MIPS is here, with a comment stating:

/* work to do on any return to u-space */

Fast path

Linux will:

  1. Enable the IRQs.
  2. Check if the syscall number is out of range (note the pt_regs struct was already created with ENOSYS for the value of rax).
  3. Dispatch to the system call with an indirect jump.
  4. Save the return value (rax) of the syscall into rax in the pt_regs on the stack.
  5. Check again whether _TIF_ALLWORK_MASK is set for the current task; if it is, jump to the slow return path.
  6. Restore the caller context and issue a sysret.
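
Steps 2 to 4 of the fast path can be modeled in user-space C as a range check followed by a dispatch through a table of function pointers. This is only a sketch of the idea: `demo_regs`, `demo_table`, and the two toy handlers are hypothetical stand-ins for pt_regs, the syscall table, and real syscalls, and the actual fast path is written in assembly.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct demo_regs {
    long ax;       /* return value slot            */
    long orig_ax;  /* syscall number on entry      */
    long di, si, dx;  /* first three arguments     */
};

typedef long (*demo_syscall_fn)(long, long, long);

/* Toy handlers, for illustration only. */
static long demo_sys_add3(long a, long b, long c) { return a + b + c; }
static long demo_sys_first(long a, long b, long c) { (void)b; (void)c; return a; }

static const demo_syscall_fn demo_table[] = { demo_sys_add3, demo_sys_first };
#define DEMO_NR_SYSCALLS (sizeof(demo_table) / sizeof(demo_table[0]))

static void demo_fast_path(struct demo_regs *regs)
{
    /* The saved ax starts out as -ENOSYS, so an out-of-range number
     * needs no extra work: the default is simply left in place. */
    regs->ax = -ENOSYS;

    /* Unsigned compare also rejects negative syscall numbers. */
    unsigned long nr = (unsigned long)regs->orig_ax;
    if (nr < DEMO_NR_SYSCALLS)
        regs->ax = demo_table[nr](regs->di, regs->si, regs->dx);
}
```

Pre-loading the return slot with -ENOSYS is what lets the range check be a single compare and branch, with no separate error-handling code for invalid numbers.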

Slow return path

  1. Save the registers not saved before in pt_regs (rbx, rbp, r12-r15).
  2. Call syscall_return_slowpath, defined here.

Note that point 2 will end up calling trace_sys_exit.

Slow path

  1. Save the registers not saved before in pt_regs (see above)
  2. Call do_syscall_64, defined here.

Point 2 will call syscall_trace_enter.
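
To make the contrast with the fast path concrete, here is a rough user-space model of what the slow path adds: a hook before the dispatch and a hook after it. All `sketch_*` names are hypothetical, and the hooks merely count invocations; the real syscall_trace_enter and syscall_return_slowpath do considerably more (ptrace stops, seccomp, audit, tracepoints).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct pt_regs. */
struct sketch_regs {
    long ax;       /* return value slot       */
    long orig_ax;  /* syscall number on entry */
    long di;       /* first argument          */
};

static int sketch_enter_hooks;  /* how often the entry hook ran */
static int sketch_exit_hooks;   /* how often the exit hook ran  */

/* Stand-in for the entry hook: a tracer could inspect or change the
 * syscall number here. This sketch just returns it unchanged. */
static long sketch_trace_enter(struct sketch_regs *regs)
{
    sketch_enter_hooks++;
    return regs->orig_ax;
}

/* Stand-in for the exit hook that runs on the slow return path. */
static void sketch_return_slowpath(struct sketch_regs *regs)
{
    (void)regs;
    sketch_exit_hooks++;
}

static long sketch_sys_double(long x) { return 2 * x; }

typedef long (*sketch_fn)(long);
static const sketch_fn sketch_table[] = { sketch_sys_double };
#define SKETCH_NR (sizeof(sketch_table) / sizeof(sketch_table[0]))

/* Slow-path model: optional entry hook, dispatch, then exit hook. */
static void sketch_do_syscall(struct sketch_regs *regs, bool entry_work)
{
    long nr = regs->orig_ax;

    regs->ax = -ENOSYS;
    if (entry_work)
        nr = sketch_trace_enter(regs);

    if (nr >= 0 && (unsigned long)nr < SKETCH_NR)
        regs->ax = sketch_table[nr](regs->di);

    sketch_return_slowpath(regs);
}
```

The dispatch itself is the same as on the fast path; the extra cost is the full register save plus the two hook calls around it.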


So the slow versus fast path distinction has to do with ptrace. I haven't dug deeper into the code, but I suppose the whole tracing machinery is skipped when the caller is not being traced.
This is indeed an important optimization.

Margaret Bloom
  • Hello, thanks for your reply; it's really clear. I have one more question: is there a convenient way to step through this code in a debugger to learn it more deeply? – tyChen May 17 '20 at 01:58
  • @tyChen Yes, see [here](https://www.kernel.org/doc/html/v4.14/dev-tools/gdb-kernel-debugging.html). I've never done that, though. There are even graphical front-ends for GDB. – Margaret Bloom May 17 '20 at 07:00