Why keep a kernel stack for each process in linux?

Question

What's the point in keeping a different kernel stack for each process in linux?

Why not keep just one stack for the kernel to work with?

This question: http://stackoverflow.com/questions/886807/kernel-stack-for-linux-process has some related information — Innot Kauker, Jun 25 '14 at 16:16

artless noise · Accepted Answer · 2021-03-21T16:18:15.723

What's the point in keeping a different kernel stack for each process in linux?

It simplifies pre-emption of processes in the kernel space.

Why not keep just one stack for the kernel to work with?

It would be a nightmare to implement pre-emption without separate stacks.

Separate kernel stacks are not really mandated. Each architecture is free to do whatever it wants. If there were no pre-emption during a system call, then a single kernel stack might make sense.

However, *nix has processes, and each process can make a system call. Linux allows one task to be pre-empted during a write(), etc., and another task to be scheduled. The kernel stack is a snapshot of the context of the kernel work being performed for each process.
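As a user-space analogy only (this is not kernel code), the "snapshot" idea can be sketched with the glibc `ucontext` routines. The names `fake_syscall`, `run_demo`, and `trace` are hypothetical and made up for this illustration. Each task gets its own stack, is suspended in the middle of a call, and later finds its local variable intact when resumed, even though another task ran in between.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)   /* assumed size, chosen for the demo */

ucontext_t sched_ctx, task_ctx[2];
int trace[4];                    /* values observed after each resume */
int trace_len;

/* Hypothetical stand-in for a system call that gets pre-empted half-way. */
void fake_syscall(int id)
{
    int local = 100 + id;                    /* lives on this task's own stack */
    swapcontext(&task_ctx[id], &sched_ctx);  /* "pre-empted": back to scheduler */
    assert(trace_len < 4);
    trace[trace_len++] = local;              /* still intact after the resume */
}

/* Starts two tasks, suspends both mid-call, then resumes them in reverse order.
 * Returns the number of tasks that completed, or -1 on allocation failure. */
int run_demo(void)
{
    char *stacks[2] = { NULL, NULL };
    trace_len = 0;

    for (int i = 0; i < 2; i++) {
        stacks[i] = malloc(STACK_SIZE);
        if (!stacks[i]) {
            free(stacks[0]);
            return -1;
        }
        getcontext(&task_ctx[i]);
        task_ctx[i].uc_stack.ss_sp = stacks[i];   /* one stack per task */
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link = &sched_ctx;         /* where to go on return */
        makecontext(&task_ctx[i], (void (*)(void))fake_syscall, 1, i);
    }

    swapcontext(&sched_ctx, &task_ctx[0]);  /* task 0 runs, then suspends */
    swapcontext(&sched_ctx, &task_ctx[1]);  /* task 1 runs, then suspends */
    swapcontext(&sched_ctx, &task_ctx[1]);  /* task 1 finishes */
    swapcontext(&sched_ctx, &task_ctx[0]);  /* task 0 finishes */

    free(stacks[0]);
    free(stacks[1]);
    return trace_len;
}
```

With a single shared stack, the second task's frames would overwrite the first task's suspended frames, which is the problem per-process kernel stacks avoid.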

Also, the per-process kernel stacks come with little overhead. A thread_info or some mechanism to get the process information from assembler is needed. This is at least a page allocation. By placing the kernel mode stack in the same location, a simple mask can get the thread_info from assembler. So, we already need the per-process variable and allocation. Why not use it as a stack to store kernel context and allow preemption during system calls?
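The "simple mask" can be sketched in user-space C. This is a minimal illustration, assuming an 8 KiB, size-aligned block with the per-task record at its lowest address and the stack growing down from the top; `THREAD_SIZE`, the one-field `struct thread_info`, and both function names are stand-ins for the demo, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u            /* assumed: power of two, block aligned to it */

struct thread_info { int pid; };     /* hypothetical, heavily simplified record */

/* Round any address inside the block down to the block's base,
 * which is where the per-task record is placed in this sketch. */
struct thread_info *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info *)(sp & ~((uintptr_t)THREAD_SIZE - 1));
}

/* Builds one stack block, fakes a stack pointer inside it, and looks the
 * record up again using only that pointer. Returns the pid found, or -1. */
int demo_lookup(int pid)
{
    unsigned char *block = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (!block)
        return -1;

    struct thread_info *ti = (struct thread_info *)block;
    ti->pid = pid;

    /* Simulated stack pointer: somewhere near the top of the block. */
    uintptr_t sp = (uintptr_t)(block + THREAD_SIZE - 128);

    struct thread_info *found = thread_info_from_sp(sp);
    assert(found == ti);

    int result = found->pid;
    free(block);
    return result;
}
```

The lookup needs no global "current task" variable and no memory access beyond the record itself, which is why it is cheap to do from assembler.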

The efficiency of preemption can be demonstrated by the write() mentioned above. If the write() is to disk or network, it will take time to complete. A 5k to 8k buffer written to disk or network will take many CPU cycles to complete (if synchronous), and the user process will block until it is finished. This transfer in the driver can be done with DMA: a hardware element completes the task of transferring the buffer to the device. In the meantime, because the kernel keeps a different stack per process, a lower-priority process can have the CPU and is allowed to make system calls of its own. These stacks are near zero cost, as the kernel already needs bookkeeping information for process state, and the two are both kept in a 4k or 8k page.

The example at the end shows I/O concurrency for threads or processes. A microkernel would have the drivers as tasks (processes), and communication would happen between the user task and the driver task. The driver would `yield` during the DMA operation, and then the kernel could schedule some other non-blocked task. Linux is not a microkernel, and drivers are implemented in kernel space. It would be extremely difficult to have thread I/O concurrency without a stack per thread. Think of a web server or database for some examples of why I/O concurrency is good. — artless noise, Mar 28 '21 at 12:36

A final fact (which I overlooked): most architectures have banked registers that are active in different processor modes, for instance a user versus a supervisor stack pointer. The management of a context switch in kernel space involves switching only the supervisor stack pointer. The banked stack is in kernel-accessible memory, and the 'thread info' is retrieved with a simple mask. The downside is that a kernel stack overflow will overwrite the 'thread info'. — artless noise, Mar 28 '21 at 12:43

score 1 · Answer 2 · answered Jun 25 '14 at 16:17

Why not keep just one stack for the kernel to work with?

In this case only one process/thread would be able to enter the kernel at a time.

Basically, each thread has its own stack, and crossing the boundary from user space to the kernel does not change this fact. The kernel also has its own kernel threads (not belonging to any user-space process), and they all have their own stacks.
