What is the proper way to pass traffic using GRE tunnel (or any vNIC) using eBPF?

Question

I have a GRE link set up on a VM using the following command: ip tunnel add tap0 mode gre local <foo> remote <bar>. The counterpart on a different VM (in the same subnet) is configured identically, with <foo> and <bar> swapped.

I have created an eBPF tc program that calls bpf_clone_redirect to copy packets to the tunnel device on one of the hosts (i.e., duplicating the traffic to the tap0 link):

SEC("tc")
int tc_ingress(struct __sk_buff *skb) {
    __u32 map_key = 0;
    struct destination *dest = bpf_map_lookup_elem(&destinations, &map_key);

    if (dest != NULL) {
        // Named tun_key so it no longer shadows the map lookup key above.
        struct bpf_tunnel_key tun_key = {};
        int ret;

        tun_key.remote_ipv4 = dest->destination_ip;
        tun_key.tunnel_id = dest->iface_idx;
        tun_key.tunnel_tos = 0;
        tun_key.tunnel_ttl = 64;
        ret = bpf_skb_set_tunnel_key(skb, &tun_key, sizeof(tun_key), 0);
        if (ret < 0) {
            // Setting the tunnel key failed: skip the redirect and let the packet continue.
            return TC_ACT_OK;
        }
        // A zero flags argument means the cloned socket buffer is
        // sent to the egress path of the target interface.
        bpf_clone_redirect(skb, dest->iface_idx, 0);
    }
    return TC_ACT_OK;
}

I see the traffic passed to the GRE link tap0 when running tcpdump -i tap0, but I don't see the traffic on its remote counterpart.

Is it necessary in such a scenario to assign an address to the device (à la ip addr <> dev tap0)?
What is the proper way of defining such tunnels?
If I have iptables rules set up on eth0, would they block traffic sent to the GRE link? If yes, is there a way to bypass them?

Do you see the packets on the native device (the device on which the GRE tunnel is created)? If yes, are the packets encapsulated? If not, did you try tracing where the packets are dropped with cilium/pwru? — pchaigno, Mar 11 '22 at 14:22
@pchaigno 1. No, I don't see the packets on `eth0`. To be clear: if I am duplicating packets *from* `eth0` to `tap0`, should I see *both* packets (original and duplicate) on `eth0` (assuming it's the underlying device)? 2. How can I use cilium/pwru to trace where they are dropped? — Nimrodshn, Mar 11 '22 at 14:43
You should be able to trace the packet with the various filters. See https://github.com/cilium/pwru#usage. A `kfree_skb` function should be shown if packets are dropped. I suspect you just need to set the tunnel ID, TTL, and outer destination IP via `bpf_skb_set_tunnel_key`. — pchaigno, Mar 11 '22 at 14:52
@pchaigno is the tunnel id the interface id of `tap0` (the GRE tunnel) ? — Nimrodshn, Mar 11 '22 at 17:20
For VXLAN tunnels, it's the Virtual Network Identifier (VNI). For GRE, I'm not sure. It may be unused or it could be the optional Key. I think the outer destination IP is what matters most here. — pchaigno, Mar 11 '22 at 21:34
@pchaigno So now I am seeing the packets going out of eth0 via `sudo tcpdump -i eth0 -Q out`, for example: `IP 10.0.100.5 > 10.0.100.8: GREv0, length 56: IP XXX.XXX.XXX.XXX > 10.0.100.5.24497: Flags [.], ack 22842250, win 1687, options [nop,nop,TS val 904755902 ecr 2911175870], length 0`. But for some reason I'm not seeing it on the paired machine (10.0.100.8), not even on `eth0` for that machine.. Thoughts? — Nimrodshn, Mar 12 '22 at 17:12
If you see them on the native interface of one machine but not of the other, then the packets must be dropped in between. Any firewall rules in between? I understand those are two VMs on the same machine, right? How are they connected? — pchaigno, Mar 13 '22 at 13:35
@pchaigno I'm not sure if they're on the same machine; these are (Azure) cloud VMs on the same subnet. Also, I've seen a discussion on Cilium's eBPF Slack channel about a somewhat similar issue (except that traffic was redirected to a veth and not a GRE tunnel) where the writer stated: "Solved: Missing TCA_BPF_FLAG_ACT_DIRECT in the netlink command.". Would you know where and how this flag is used? Should I use it somewhere as well? — Nimrodshn, Mar 14 '22 at 07:08
@pchaigno I have now found that GRE isn't supported on Azure VNETs :( .. This explains this behavior (https://learn.microsoft.com/en-us/answers/questions/496591/does-azure-virtual-network-support-gre.html) — Nimrodshn, Mar 14 '22 at 07:59

Nimrodshn · Answer 1 · 2022-03-14T09:49:26.950

For anyone trying to route traffic to a GRE tunnel: populate a struct bpf_tunnel_key and pass it to the bpf_skb_set_tunnel_key helper before redirecting. An example is available at https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/samples/bpf/tc_l2_redirect_kern.c

As for my use case: anyone trying to create a GRE tunnel between Azure VMs should note that this is currently not possible, per https://learn.microsoft.com/en-us/answers/questions/496591/does-azure-virtual-network-support-gre.html

edited Mar 14 '22 at 09:49

answered Mar 14 '22 at 09:18

Note this doesn't answer the question you posted, which made no mention of Azure. It might be best to keep the original question, mention `bpf_skb_set_tunnel_key` as the solution, and optionally open another question for the Azure aspect. – pchaigno Mar 14 '22 at 09:35
