I have an application which runs on Linux (2.6.38.8), using libpcap (>1.0) to capture packets streamed at it over Ethernet. My application uses close to 100% CPU and I am unsure whether I am using libpcap as efficiently as possible.

I am battling to find any correlation between the pcap tunables and performance.

Here is my simplified code (error checking etc. omitted):

// init libpcap
pcap_t *p = pcap_create("eth0", my_errbuf);
pcap_set_snaplen(p, 65535);
pcap_set_promisc(p, 0);
pcap_set_timeout(p, 1000);
pcap_set_buffer_size(p, 16<<20); // 16MB
pcap_activate(p);

// filter
struct bpf_program filter;
pcap_compile(p, &filter, "ether dst 00:11:22:33:44:55", 0, 0);
pcap_setfilter(p, &filter);

// do work
while (1) {
    int ret = pcap_dispatch(p, -1, my_callback, (unsigned char *) my_args);
    if (ret <= 0) {
        if (ret == -1) {
            printf("pcap_dispatch error: %s\n", pcap_geterr(p));
        } else if (ret == -2) {
            printf("pcap_dispatch broken loop\n");
        } else if (ret == 0) {
            printf("pcap_dispatch zero packets read\n");
        } else {
            printf("pcap_dispatch returned unexpectedly\n");
        }
    } else if (ret > 1) {
        printf("processed %d packets\n", ret);
    }
}

With a timeout of 1000 milliseconds and a buffer size of 2 MB, 4 MB, or 16 MB, the result is the same at high data rates (~200 1 kB packets/sec): pcap_dispatch consistently returns 2. According to the pcap_dispatch man page, I would expect pcap_dispatch to return either when the buffer is full or when the timeout expires. But with a return value of 2, neither of these conditions should be met, as only 2 kB of data has been read and only 2/200 of a second has passed.
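To put numbers on that expectation, here is a rough sketch of the arithmetic. The helper names are hypothetical, "1 kB" is assumed to mean 1024 bytes, and per-packet header overhead inside the capture buffer is ignored, so the figures are only approximate.

```c
#include <assert.h>
#include <stdint.h>

/* Packets expected to arrive during one read timeout at a steady rate. */
static uint64_t packets_per_timeout(uint64_t packets_per_sec, uint64_t timeout_ms)
{
    return packets_per_sec * timeout_ms / 1000;
}

/*
 * Milliseconds needed to fill the capture buffer at a steady rate.
 * Assumption: every packet occupies exactly packet_bytes in the buffer
 * (real capture buffers add per-packet headers, so this is an upper bound).
 */
static uint64_t buffer_fill_ms(uint64_t buffer_bytes,
                               uint64_t packets_per_sec,
                               uint64_t packet_bytes)
{
    assert(packets_per_sec > 0 && packet_bytes > 0);
    uint64_t bytes_per_sec = packets_per_sec * packet_bytes;
    return buffer_bytes * 1000 / bytes_per_sec;
}
```

Under these assumptions, `packets_per_timeout(200, 1000)` gives 200 packets per timeout and `buffer_fill_ms(16 << 20, 200, 1024)` gives 81920 ms, so a return value of 2 is far below what either condition alone would produce.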

If I slow down the data rate (~100 1 kB packets/sec), pcap_dispatch returns between 2 and 7, so halving the data rate affects how many packets are processed per pcap_dispatch call. (I think the more packets per call the better, as this means less context switching between the kernel and user space. Is this true?)
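One way to quantify the batch sizes instead of reading them off individual printf lines would be to accumulate the return values of pcap_dispatch. The following is only a sketch: `struct batch_stats` and its helpers are hypothetical names, not part of libpcap, and calls that return zero or an error are simply skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Running summary of how many packets each successful dispatch call delivered. */
struct batch_stats {
    unsigned long calls;    /* calls that delivered at least one packet */
    unsigned long packets;  /* total packets delivered */
    int min_batch;          /* smallest batch seen (0 until first record) */
    int max_batch;          /* largest batch seen */
};

static void batch_stats_init(struct batch_stats *s)
{
    assert(s != NULL);
    s->calls = 0;
    s->packets = 0;
    s->min_batch = 0;
    s->max_batch = 0;
}

/* Record one return value; zero and negative values are ignored. */
static void batch_stats_record(struct batch_stats *s, int ret)
{
    assert(s != NULL);
    if (ret <= 0)
        return;
    if (s->calls == 0 || ret < s->min_batch)
        s->min_batch = ret;
    if (ret > s->max_batch)
        s->max_batch = ret;
    s->calls++;
    s->packets += (unsigned long)ret;
}

/* Mean packets per successful call, or 0.0 if nothing was recorded. */
static double batch_stats_mean(const struct batch_stats *s)
{
    assert(s != NULL);
    return s->calls ? (double)s->packets / (double)s->calls : 0.0;
}
```

In the loop above, `batch_stats_record(&stats, ret)` would go right after the pcap_dispatch call, and the mean could be printed every few thousand calls to compare runs with different timeout and buffer settings.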

The timeout value does not seem to make a difference either.

In all cases, my CPU usage is close to 100%.

I am starting to wonder if I should be trying the PF_RING version of libpcap, but from what I've read on SO and the libpcap mailing lists, libpcap > 1.0 does the zero-copy stuff anyway, so there may be no point.

Any ideas, pointers greatly appreciated! G

Gman
  • `while(1)` is your culprit, not libpcap. Find an example that uses pcap_loop(). – Iron Savior May 22 '13 at 15:59
  • No, that shouldn't make a difference. `pcap_dispatch()` blocks, so a while loop repeatedly calling `pcap_dispatch()` shouldn't be that different from just calling `pcap_loop()`. Two nested loops rather than one loop shouldn't make much difference here, unless the callback is doing little or no work. –  Mar 26 '15 at 21:12
  • Just for the record, `cnt` is the *maximum* number of packets read before returning, not the *minimum*. What it guarantees is that you won't ever see `> cnt` packets in one call, not that you will receive `cnt` packets unless timed out. – Oppen Apr 03 '20 at 18:28
