
I am new to eBPF and XDP and want to learn them. My question is: how can I use an eBPF filter to filter packets by matching a specific payload? For example, if the data (payload) of the packet is 1234, it is passed to the network stack; otherwise the packet is blocked. I managed to get the payload length: matching on the payload length works fine, but when I start matching the payload characters I get an error. Here is my code:

int ret_val;
unsigned long payload_offset;
unsigned long payload_size;
const char *payload = "test";
struct ethhdr *eth = data;

if ((void*)eth + sizeof(*eth) <= data_end) {
    struct iphdr *ip = data + sizeof(*eth);
    if ((void*)ip + sizeof(*ip) <= data_end) {
        if (ip->protocol == IPPROTO_UDP ) {
            struct udphdr *udp = (void*)ip + sizeof(*ip);
            if ((void*)udp + sizeof(*udp) <= data_end) {
                if (udp->dest == ntohs(5005)) {
                    payload_offset = sizeof(struct udphdr);
                    payload_size = ntohs(udp->len) - sizeof(struct udphdr);
                    unsigned char *s = (unsigned char *)&payload_size;

                    if (ret_val == __builtin_memcmp(s,payload,4) == 0) {
                        return XDP_DROP;
                    }
                }
            }
        }
    }
}

The error is gone, but I am still unable to compare the payload... I am sending the UDP message from Python socket code. If I compare the payload length, it works fine.

Qeole
  • 8,284
  • 1
  • 24
  • 52
Linux baby
  • 21
  • 1
  • 5
  • Try replacing `memcmp()` with `__builtin_memcmp()` (or removing it entirely if you need to compare only two bytes, just use `==`). If it doesn't work, it would be helpful to provide a larger chunk of code. Note that `unsigned char *s=(unsigned char *)&payload_size;` means `s` points to your variable holding the size of the payload, is that what you want? And not sure how you initialise `payload`. Or how you do the checks on lengths. – Qeole Jun 08 '20 at 16:15
  • `ret_val == __builtin_memcmp(s,payload,4)` looks incorrect, you probably want a single `=` here. Regarding code style, I'd avoid comparisons in `if ()` conditions, and if I may, I'd use early returns more often and avoid nesting all those `if`s. – Qeole Jun 08 '20 at 17:01
  • I tried with `if (ret_val = __builtin_memcmp(s,payload,4) == 0)` but I am still unable to match the payload. – Linux baby Jun 08 '20 at 17:12
  • Good ol'`bpf_trace_printk()` to the rescue then, you'll have to debug your program :). I'd start by looking at what is in `*s` and `*payload`, to see if you have what you expect. Not sure how I can help more without full code and error observed etc. – Qeole Jun 08 '20 at 20:41

2 Answers


What did you try? You should probably read a bit more about eBPF to try to understand how to process packets, the basic example you give does not sound too complicated.

Basically you would have to parse the headers to see where your payload begins. Simple BPF parsing examples might help you understand the principles:

  1. Start from beginning of header (e.g. Ethernet at first)
  2. Check that the packet is long enough to hold the header (otherwise you would risk an out-of-bounds access when reading it or the upper layers)
  3. Add header length to get the offset of your next header (e.g. IPv4, then e.g. TCP...)
  4. Rinse and repeat.

In your case you would process all headers until you get the offset of the data payload. Note that this is trivial if the traffic you try to match always has the same headers (e.g. always IPv4 and UDP), but you get more cases to sort out if there is a mix (IPv4 + IPv6, encapsulation, IPv4 options...).
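To make the steps above a bit more concrete, here is a minimal sketch of the header walk. It is plain userspace C working on a byte buffer, not a loadable XDP program, and it assumes untagged Ethernet + IPv4 + UDP only. The names `udp_payload_offset` and `fits` and the constants are made up for illustration; in a real XDP program the same checks would be written against `data_end` from the context.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14      /* Ethernet header, no VLAN tag (assumption) */
#define ETHERTYPE_IPV4  0x0800
#define IPV4_MIN_HDR    20
#define IP_PROTO_UDP    17
#define UDP_HDR_LEN     8

/* Same role as the "ptr + size > data_end" checks in an XDP program.
 * Callers guarantee p <= end. */
static int fits(const uint8_t *p, size_t n, const uint8_t *end)
{
    return (size_t)(end - p) >= n;
}

/* Returns the offset of the UDP payload from `data`, or -1 if the packet
 * is not Ethernet + IPv4 + UDP or is too short for one of the headers. */
static long udp_payload_offset(const uint8_t *data, const uint8_t *data_end)
{
    const uint8_t *ip, *udp;
    size_t ihl;

    /* Steps 1-2: start at Ethernet, check that the header fits. */
    if (!fits(data, ETH_HDR_LEN, data_end))
        return -1;
    if (((data[12] << 8) | data[13]) != ETHERTYPE_IPV4)
        return -1;

    /* Step 3: next header is IPv4; check the minimal header fits. */
    ip = data + ETH_HDR_LEN;
    if (!fits(ip, IPV4_MIN_HDR, data_end))
        return -1;
    if ((ip[0] >> 4) != 4)
        return -1;

    /* IHL counts 32-bit words and includes IPv4 options, if any. */
    ihl = (size_t)(ip[0] & 0x0f) * 4;
    if (ihl < IPV4_MIN_HDR || !fits(ip, ihl, data_end))
        return -1;
    if (ip[9] != IP_PROTO_UDP)
        return -1;

    /* Step 4: rinse and repeat for UDP. */
    udp = ip + ihl;
    if (!fits(udp, UDP_HDR_LEN, data_end))
        return -1;

    return (long)(udp + UDP_HDR_LEN - data);
}
```

Using the IHL field instead of a fixed 20 bytes is what covers the "IPv4 options" case mentioned above; IPv6 and encapsulation would need their own branches.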

Once you have the offset for your data, just compare data at this offset to your pattern (that you may hardcode in the BPF program or get from a BPF map, depending on your use case). Note that you do not have access to strcmp(), but __builtin_memcmp() is available if you need to compare more than 64 bits.
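As a rough illustration of that comparison step, here is a small sketch in plain userspace C (so it can be tried outside the kernel). The helper name `payload_starts_with` is hypothetical; the point is only that the length check comes before any byte is read, and that a byte-by-byte loop can stand in for strcmp().

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the `len` bytes at `payload` equal `pattern`, 0 otherwise.
 * Refuses to read anything if fewer than `len` bytes remain before
 * `data_end`. Callers guarantee payload <= data_end. */
static int payload_starts_with(const uint8_t *payload, const uint8_t *data_end,
                               const uint8_t *pattern, size_t len)
{
    size_t i;

    if ((size_t)(data_end - payload) < len)
        return 0;

    for (i = 0; i < len; i++)
        if (payload[i] != pattern[i])
            return 0;

    return 1;
}
```

In a BPF program the pattern could be a local array, as in a hardcoded match, or a value looked up in a BPF map.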

(All the above applying of course to a C program that you would compile into an object file containing eBPF instructions with the LLVM back-end.)

If you wanted to search for a string at an arbitrary offset in the payload, note that eBPF has supported (bounded) loops since kernel 5.3 (if I remember correctly).
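For what it is worth, a search at an arbitrary offset could look roughly like the sketch below. It is ordinary userspace C; `MAX_SCAN`, `PAT_LEN` and `find_pattern` are assumptions made up for the example. Both loops have compile-time upper bounds, which is the general shape a bounded loop needs, but whether a given kernel's verifier accepts an equivalent BPF program is something you would have to test.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SCAN 64   /* hypothetical cap on the number of offsets tried */
#define PAT_LEN  4    /* fixed pattern length for this sketch */

/* Returns the first offset (below MAX_SCAN) at which `pattern` occurs in
 * [payload, data_end), or -1 if it does not occur there. */
static int find_pattern(const uint8_t *payload, const uint8_t *data_end,
                        const uint8_t pattern[PAT_LEN])
{
    size_t avail = (size_t)(data_end - payload);
    size_t off, i;

    for (off = 0; off < MAX_SCAN; off++) {
        /* Stop once fewer than PAT_LEN bytes remain at this offset. */
        if (off + PAT_LEN > avail)
            break;

        for (i = 0; i < PAT_LEN; i++)
            if (payload[off + i] != pattern[i])
                break;

        if (i == PAT_LEN)
            return (int)off;
    }

    return -1;
}
```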

Qeole

Your edit is pretty much a new question, so here is an updated answer. Please consider opening a new question instead in the future.

There are a number of things that are wrong in your program. In particular:

1|    payload_offset = sizeof(struct udphdr);
2|    payload_size = ntohs(udp->len) - sizeof(struct udphdr);
3|    unsigned char *s = (unsigned char *)&payload_size;
4|
5|    if (ret_val == __builtin_memcmp(s, payload, 4) == 0) {
6|        return XDP_DROP;
7|    }
  • On line 1, your payload_offset variable is not an offset, it just contains the length of the UDP header. You would need to add that to the start of the UDP header to get the actual payload offset.
  • Line 2 is fine.
  • Line 3 does not make any sense! You make s (that you later compare to your pattern) point towards the size of the payload? (a.k.a “I told you so in the comments! :)”). Instead, it should point to... the beginning of the payload, maybe? So, basically, data + payload_offset (once offset is fixed).
  • Between lines 3 and 5, the check on payload length is missing. When you try to access your payload in s (__builtin_memcmp(s, payload, 4)), you try to compare four bytes of packet data; you must ensure that the packet is long enough to read those four bytes (just as you checked the length each time before you read from an Ethernet, IP or UDP header field).
  • While at it, we can also check that the length of the payload is equal to the length of the pattern to match, and exit if they differ without having to compare the bytes.
  • Line 5 has a == instead of =, as discussed in the comments. Easy to fix. However, I had no luck with __builtin_memcmp() for your program, it seems LLVM does not want to inline it and turns it into a failing function call. Never mind, we can work without it. For your example, you can cast to int and compare the four-byte long values directly. For longer patterns, and for recent kernels (or by unrolling if pattern size is fixed), we can use bounded loops.
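As an illustration of the "compare the four-byte values directly" idea from the last bullet, here is a minimal sketch in plain userspace C. The name `payload_is_test` is made up, and the sketch copies the bytes with memcpy() rather than casting the pointer, which sidesteps alignment and aliasing concerns in ordinary C; how a given LLVM version compiles this for BPF is not something this example claims.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the first four payload bytes are "test", 0 otherwise.
 * Callers guarantee payload <= data_end. */
static int payload_is_test(const uint8_t *payload, const uint8_t *data_end)
{
    uint32_t got, want;

    /* The missing check between lines 3 and 5: four bytes must be there. */
    if ((size_t)(data_end - payload) < sizeof(got))
        return 0;

    /* Load both sides as one 32-bit value and compare once. Both values
     * are read in host byte order, so no byte swapping is needed. */
    memcpy(&got, payload, sizeof(got));
    memcpy(&want, "test", sizeof(want));

    return got == want;
}
```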

Here is an amended version of your program that works on my setup.

#include <arpa/inet.h>
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/udp.h>

int xdp_func(struct xdp_md *ctx)
{
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    char match_pattern[] = "test";
    unsigned int payload_size, i;
    struct ethhdr *eth = data;
    unsigned char *payload;
    struct udphdr *udp;
    struct iphdr *ip;

    if ((void *)eth + sizeof(*eth) > data_end)
        return XDP_PASS;

    ip = data + sizeof(*eth);
    if ((void *)ip + sizeof(*ip) > data_end)
        return XDP_PASS;

    if (ip->protocol != IPPROTO_UDP)
        return XDP_PASS;

    udp = (void *)ip + sizeof(*ip);
    if ((void *)udp + sizeof(*udp) > data_end)
        return XDP_PASS;

    if (udp->dest != htons(5005))
        return XDP_PASS;

    payload_size = ntohs(udp->len) - sizeof(*udp);
    // Here we use "size - 1" to account for the final '\0' in "test".
    // This '\0' may or may not be in your payload, adjust if necessary.
    if (payload_size != sizeof(match_pattern) - 1) 
        return XDP_PASS;

    // Point to start of payload.
    payload = (unsigned char *)udp + sizeof(*udp);
    if ((void *)payload + payload_size > data_end)
        return XDP_PASS;

    // Compare each byte, exit if a difference is found.
    for (i = 0; i < payload_size; i++)
        if (payload[i] != match_pattern[i])
            return XDP_PASS;

    // Same payload, drop.
    return XDP_DROP;
}
Qeole
  • If I want to compare it against more patterns, what should I do? I can't use strcmp to compare each pattern. Is there any way to do this? – Linux baby Dec 10 '20 at 22:07
  • What's the difference between one and several comparisons? Just move the code you use to compare the pattern into a function (inline or not, depending on your kernel requirements). – Qeole Dec 10 '20 at 23:19
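
A rough sketch of what "move the comparison into a function" could look like for several patterns is given below. It is plain userspace C; the `struct pattern` layout, the example patterns and the names `bytes_equal` and `matches_any` are all assumptions for illustration. Patterns are stored as fixed-size arrays rather than pointers, on the assumption that this is easier to carry over to a BPF program or a BPF map value.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAT_LEN 8   /* hypothetical maximum pattern length */

struct pattern {
    uint8_t bytes[MAX_PAT_LEN];
    size_t  len;
};

/* Example patterns only; replace with whatever you need to match. */
static const struct pattern patterns[] = {
    { "test",  4 },
    { "1234",  4 },
    { "hello", 5 },
};

/* The comparison loop, moved into its own function. */
static int bytes_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Returns 1 if the payload is exactly equal to one of the patterns.
 * The caller must already have checked that `payload_size` bytes are
 * readable at `payload`. */
static int matches_any(const uint8_t *payload, size_t payload_size)
{
    size_t p;

    for (p = 0; p < sizeof(patterns) / sizeof(patterns[0]); p++) {
        if (payload_size != patterns[p].len)
            continue;
        if (bytes_equal(payload, patterns[p].bytes, payload_size))
            return 1;
    }
    return 0;
}
```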