According to the nftables wiki (see also this answer), packet defragmentation happens at priority -400. However, when I create a chain at priority -300:
flush ruleset;
table ip test {
    chain prerouting {
        type filter hook prerouting priority -300; policy accept;
        ip frag-off & 0x1fff != 0 log;
    }
}
I clearly see fragmented packets in the kernel logs:
[ 2526.162244] IN=ens7 OUT= MAC=0c:5c:00:2d:b4:03:0c:80:9a:6a:23:01:08:00 SRC=201.201.201.1 DST=200.200.200.2 LEN=1500 TOS=0x00 PREC=0x00 TTL=63 ID=33977 MF FRAG:185 PROTO=UDP
[ 2526.162752] IN=ens7 OUT= MAC=0c:5c:00:2d:b4:03:0c:80:9a:6a:23:01:08:00 SRC=201.201.201.1 DST=200.200.200.2 LEN=961 TOS=0x00 PREC=0x00 TTL=63 ID=33977 FRAG:370 PROTO=UDP
The above is just a minimal reproducible example; in our actual ruleset, this leads to problems such as only the initial UDP fragment undergoing (raw) NAT, etc.
The kernel module nf_conntrack is loaded, along with nf_defrag_ipv4. What am I doing wrong?
EDIT:
I find that this behaviour goes away as soon as I add a rule that depends on conntrack. The rule may be anything at all, e.g.
nft add rule ip test prerouting ct state new,invalid,established,related counter accept
It's as if referencing conntrack in a rule tells the kernel "I want conntrack functionality". So my follow-up question is: is there a way to enable conntrack without adding this extra (dummy) rule?
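For reference, here is a sketch of the minimal example with the dummy conntrack rule from the edit folded into the ruleset file, instead of being added afterwards with a separate nft command. I am assuming the rule order matches what nft add rule produces, i.e. the ct rule is appended after the log rule. With this variant loaded, the log rule should no longer match, going by the observation above:

```
flush ruleset;
table ip test {
    chain prerouting {
        type filter hook prerouting priority -300; policy accept;
        # same fragment match as in the original example
        ip frag-off & 0x1fff != 0 log;
        # dummy rule: matches every conntrack state, present only so that
        # the ruleset references conntrack
        ct state new,invalid,established,related counter accept;
    }
}
```

The ct rule matches every state and the chain policy is already accept, so it should not change the verdict for any packet. As far as I can tell, its only effect is the side effect described above.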