How unstable are networking-related kprobes in practice?

Question

I am very new to BPF development and need to use kprobes in my BPF program so that I can detect and gather the PIDs of processes attempting to send packets over the network. I want to deploy this BPF program with my userspace app, which runs on a variety of Linux versions and distributions, all relatively recent.

I know that the kprobe mechanism is not officially stable, but how likely is my program to break in practice? I am hooking functions like tcp_connect and ip4_datagram_connect. I would have thought these functions would not change much between kernel versions so it should be safe to more or less rely on them? Or is there something I am misunderstanding?

Can I ship an app that relies on these particular (tcp/udp) kprobes without worrying too much about compatibility or stability?

score 2 · Accepted Answer · answered May 20 '20 at 06:33

The answer really depends on the function you want to trace, and there's no way to know for sure. A function's prototype might not have changed at all since Linux 2.x, and the function could still disappear in the next release.

In practice, I've found that the functions bcc traces with kprobes, for example, are quite stable. Only a few of bcc's tools have required changes to handle kernel versions released since the tools were written. That is partly because the tool writers were careful to pick more "central" functions that are less likely to change.

From a quick look, I would consider the two functions you cited, tcp_connect and ip4_datagram_connect, to be such "central" functions. For one thing, they are both exported in the symbol table.
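Since there is no guarantee either way, one option is to have the userspace app check at startup that the symbols it wants to probe exist on the running kernel. Below is a minimal sketch, assuming /proc/kallsyms is readable on the target system; the helper names missing_symbols and read_kallsyms are made up for illustration. Note that a symbol being present says nothing about whether its arguments have changed.

```python
def missing_symbols(kallsyms_text, wanted):
    """Return the names in `wanted` that do not appear in kallsyms-style text."""
    present = set()
    for line in kallsyms_text.splitlines():
        fields = line.split()
        # Each line looks like: "<address> <type> <name> [module]"
        if len(fields) >= 3:
            present.add(fields[2])
    return [name for name in wanted if name not in present]


def read_kallsyms(path="/proc/kallsyms"):
    """Read the kernel symbol list exposed by the running kernel."""
    with open(path) as f:
        return f.read()
```

For example, missing_symbols(read_kallsyms(), ["tcp_connect", "ip4_datagram_connect"]) returns an empty list when both symbols are present, so the app can fall back or warn the user when it is not empty.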

Brilliant answer, thanks fellow dutchie, maybe one day we'll meet during king's day ;) You're awesome! — horseyguy, May 21 '20 at 00:52

score 1 · Answer 2 · answered May 20 '20 at 08:59

Complement to pchaigno's answer: bcc also works great for portability because BPF programs are compiled at bcc's runtime, just before being loaded into the kernel (so you are sure to use the function definitions for the current running kernel).
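As a rough illustration of that workflow (a sketch, not a tested tool): the BPF C source is kept as a string and only compiled when BPF(text=...) is called on the target machine. The handler name trace_connect and the helper attach_all are my own choices, and the script needs bcc installed and root privileges to actually attach.

```python
# BPF C source; bcc compiles it on the target machine when BPF() is called.
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>

int trace_connect(struct pt_regs *ctx)
{
    /* Upper 32 bits of the helper's return value hold the process ID. */
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    bpf_trace_printk("connect attempt pid=%d\n", pid);
    return 0;
}
"""

# The two functions from the question.
TARGETS = ["tcp_connect", "ip4_datagram_connect"]


def attach_all(targets=TARGETS):
    """Compile BPF_TEXT and attach it to every target that can be probed."""
    from bcc import BPF  # imported lazily so this file loads without bcc

    b = BPF(text=BPF_TEXT)
    attached = []
    for name in targets:
        try:
            b.attach_kprobe(event=name, fn_name="trace_connect")
            attached.append(name)
        except Exception as exc:
            # A missing or renamed function ends up here instead of aborting.
            print(f"could not attach to {name}: {exc}")
    return b, attached
```

Calling b, attached = attach_all() followed by b.trace_print() would then print one line per connect attempt. Catching the attach failure per function lets the app keep running with whatever probes are available on that kernel.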

To work without bcc but with the same kind of portability guarantees for tracing programs, I would recommend looking at the CO-RE mechanism (Compile Once, Run Everywhere), described in detail in this blog post. In particular, CO-RE requires that the kernel has been built with BTF debug information. This information is used when loading the program to make sure it interfaces correctly with the kernel.

CO-RE does not fully eliminate the risk of kernel changes breaking BPF kprobes, but would work around some changes in function or struct definitions.
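To decide at runtime whether CO-RE is an option on a given machine, the app can check whether the kernel exposes its BTF data. A small sketch, assuming the usual sysfs location /sys/kernel/btf/vmlinux; the function name kernel_has_btf is hypothetical.

```python
import os


def kernel_has_btf(path="/sys/kernel/btf/vmlinux"):
    """Return True if the running kernel exposes BTF type information."""
    # Kernels built with BTF support publish it as a file under sysfs.
    return os.path.exists(path)
```

If this returns False, the app would need another strategy on that machine, such as the bcc approach above.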
