I'm trying to hook arbitrary Linux kernel functions at runtime, why is it not working?

Question

I'm trying to hook arbitrary Linux kernel functions at runtime with the support of an LKM and extract the parameters passed to them.

Here's the userspace part that does the whole orchestration: https://github.com/jafarlihi/ksec/blob/cfeb2c96d9d028ad2c272bb51402d4d41ca971ed/user/src/main.rs#L334

Here's what it does step-by-step:

1. Calls the LKM to get hooked_addr, the address of the function we want to hook.
2. Disassembles the code at hooked_addr and finds the first instruction boundary at or past 13 bytes (13 because movabs r10,imm64 + jmp r10 is 13 bytes long).
3. Saves the first 13 bytes plus the leftover bytes up to that instruction boundary to replaced_insns (so they can be executed before returning to the hooked function from the shim).
4. Calls the LKM to allocate executable memory and saves the address to exec_addr.
5. Calls the LKM to get the address of the "shim", the intercepting function that will be used to extract parameters, and saves it to shim_addr.
6. Constructs machine code for movabs r10,${exec_addr}; jmp r10 and saves it to hook_insns (padded with nops up to the instruction boundary).
7. Calculates jmp_back_addr as hooked_addr + 13 + the instruction boundary leftover.
8. Constructs machine code for movabs r10,${jmp_back_addr}; jmp r10 and saves it to jmp_back_insns.
9. Constructs machine code for movabs r10,${shim_addr}; call r10 and saves it to shim_insns.
10. Sends exec_addr, hooked_addr, hook_insns, replaced_insns, jmp_back_insns, and shim_insns to the LKM.
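
To make the byte counts in these steps concrete, here is a minimal sketch of the stub encoding and the boundary search. It is not the code from the linked repository: the function names are made up for illustration, and `insn_lens` stands in for the instruction lengths a disassembler would report at the start of the hooked function.

```rust
/// `movabs r10, imm64` is 10 bytes; `jmp r10` / `call r10` is 3 bytes.
const STUB_LEN: usize = 13;

/// Encodes `movabs r10, target` followed by `call r10` (is_call = true)
/// or `jmp r10` (is_call = false). Hypothetical helper name.
fn encode_stub(target: u64, is_call: bool) -> [u8; STUB_LEN] {
    let mut out = [0u8; STUB_LEN];
    out[0] = 0x49; // REX.W + REX.B (64-bit operand, extended register)
    out[1] = 0xBA; // opcode B8+rd, with rd = r10 & 7 = 2
    out[2..10].copy_from_slice(&target.to_le_bytes()); // imm64, little-endian
    out[10] = 0x41; // REX.B
    out[11] = 0xFF; // opcode group 5
    out[12] = if is_call { 0xD2 } else { 0xE2 }; // ModRM: /2 = call, /4 = jmp, rm = r10
    out
}

/// Returns the first instruction boundary at or past STUB_LEN bytes,
/// or None if the given instructions do not cover STUB_LEN bytes.
fn patch_len(insn_lens: &[usize]) -> Option<usize> {
    let mut total = 0usize;
    for &len in insn_lens {
        total += len;
        if total >= STUB_LEN {
            return Some(total);
        }
    }
    None
}

/// Builds hook_insns: the jump stub padded with single-byte NOPs (0x90)
/// up to the instruction boundary.
fn build_hook_insns(exec_addr: u64, patch_len: usize) -> Vec<u8> {
    let mut bytes = encode_stub(exec_addr, false).to_vec();
    bytes.resize(patch_len.max(STUB_LEN), 0x90);
    bytes
}
```

With instruction lengths of 1, 3, 5, and 7 bytes, the boundary falls at 16 bytes, so hook_insns would be the 13-byte stub followed by three nops and jmp_back_addr would be hooked_addr + 16.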

Here's how LKM handles it: https://github.com/jafarlihi/ksec/blob/cfeb2c96d9d028ad2c272bb51402d4d41ca971ed/kernel/ksec.c#L434

Here's what it does step-by-step:

1. Writes hook_insns to hooked_addr.
2. Writes shim_insns to exec_addr.
3. Writes replaced_insns to exec_addr, right after shim_insns.
4. Writes jmp_back_insns to exec_addr, right after replaced_insns.
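
The layout of the buffer at exec_addr can be sketched as follows. This assumes the three pieces are written back to back in the order the question's summary implies (shim call, then the replaced prologue, then the jump back). The struct and function names are hypothetical and do not come from the linked LKM.

```rust
/// Contents of the executable buffer plus the offsets of its parts.
#[derive(Debug, PartialEq)]
struct TrampolineLayout {
    /// Offset of replaced_insns from exec_addr (shim_insns sits at offset 0).
    replaced_off: usize,
    /// Offset of jmp_back_insns from exec_addr.
    jmp_back_off: usize,
    /// Bytes to write at exec_addr.
    bytes: Vec<u8>,
}

/// Concatenates shim_insns, replaced_insns, and jmp_back_insns in execution order.
fn build_trampoline(
    shim_insns: &[u8],
    replaced_insns: &[u8],
    jmp_back_insns: &[u8],
) -> TrampolineLayout {
    let mut bytes =
        Vec::with_capacity(shim_insns.len() + replaced_insns.len() + jmp_back_insns.len());
    bytes.extend_from_slice(shim_insns);
    let replaced_off = bytes.len();
    bytes.extend_from_slice(replaced_insns);
    let jmp_back_off = bytes.len();
    bytes.extend_from_slice(jmp_back_insns);
    TrampolineLayout { replaced_off, jmp_back_off, bytes }
}
```

For a 13-byte shim call, a 16-byte replaced prologue, and a 13-byte jump back, the buffer is 42 bytes long, with the prologue at offset 13 and the jump back at offset 29.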

Put simply, this is what you end up with:

- A hooked kernel function whose prologue has been replaced with a jump to an address allocated by the LKM.
- An executable region allocated by the LKM that calls the intercepting function, then executes the hooked function's replaced prologue, and then jumps back into the hooked function.

Now the issue is that it ~kinda~ works when you hook vmalloc instead of netif_rx: the shim gets called and prints something before vmalloc executes. When you hook netif_rx, nothing happens at all. When you hook other kernel functions, either nothing happens or the whole system freezes and you need to reboot.

What am I doing wrong? Why does the system freeze in most cases?

Why are you reinventing the wheel? [Kprobes](https://docs.kernel.org/trace/kprobes.html) already do exactly what you are trying to do. Anyway, without looking at the code, I would invert point 1 and 4, `hooked_addr` is the last thing you want to write, and you should make sure to do so with preemption and interrupts disabled to avoid catastrophic failure since the write is most definitely not atomic. — Marco Bonelli, Jul 20 '22 at 20:20
The hooked function could jump back to the `replaced_insns`, where it will either execute the jump to the shim again and freeze, or jump into the middle of an instruction. For the latter, the best case scenario is a #UD exception but it could interpret the instructions as something completely different as well. The second scenario is likely the reason why kprobes on x86 use `int 3` which encodes to a single byte `CC` — gizmo, Jul 21 '22 at 14:34
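
The concern in the last comment can be checked before patching. The sketch below is not part of the linked project: it assumes `branch_targets` has already been collected by disassembling the hooked function, and all names are hypothetical. It flags any branch target that lands on or inside the overwritten bytes.

```rust
/// Where a branch target falls relative to the overwritten bytes.
#[derive(Debug, PartialEq)]
enum TargetKind {
    /// Does not touch the patched range.
    Outside,
    /// Lands on hooked_addr itself, so the jump to the trampoline runs again.
    PatchStart,
    /// Lands in the middle of the stub or its nop padding.
    InsidePatch,
}

/// Classifies one branch target against the range
/// [hooked_addr, hooked_addr + patch_len).
fn classify_target(target: u64, hooked_addr: u64, patch_len: u64) -> TargetKind {
    // wrapping_sub maps targets below hooked_addr to huge offsets,
    // so a single comparison covers both sides of the range.
    let offset = target.wrapping_sub(hooked_addr);
    if offset == 0 {
        TargetKind::PatchStart
    } else if offset < patch_len {
        TargetKind::InsidePatch
    } else {
        TargetKind::Outside
    }
}

/// Returns the branch targets that land on or inside the patched range.
fn targets_into_patch(branch_targets: &[u64], hooked_addr: u64, patch_len: u64) -> Vec<u64> {
    branch_targets
        .iter()
        .copied()
        .filter(|&t| classify_target(t, hooked_addr, patch_len) != TargetKind::Outside)
        .collect()
}
```

An empty result only says that no direct branch found by the disassembler re-enters the patched bytes. It says nothing about indirect jumps or about a CPU that was already executing inside the prologue when it was overwritten.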
