
We are trying to replay a pcap file (smallFlows.pcap) over a 10 GbE connection using tcpreplay and capture all the packets, recording the source and destination ports/IP addresses. However, there is significant packet loss. At 3 Gbps, we are losing around 15% of the packets sent. Even at 1 Gbps, we are losing 7%. Our sniffer program is written in C using netmap-libpcap and is a modified version of sniffex.c.

We removed all the print statements when testing. We tried changing the snap length and buffer size, but that only slightly improved the packet loss rate. We also set the CPU cores on both the sender and receiver to performance mode to maximize the clock speeds (around 2.67 GHz on the receiver), but that had no effect. According to top, the CPU usage was fairly low - around 15%.

The receiver has an Intel Core i7 processor. The sender is running Ubuntu 12.04.3 LTS (Linux kernel 3.8.13) and the receiver is running Ubuntu 12.04 (Linux kernel 3.2.0-23-generic).

What can we do to ensure that all the packets are received?

Here is the main function:

int main(int argc, char **argv)
{

  char *dev = NULL;         /* capture device name */
  char errbuf[PCAP_ERRBUF_SIZE];        /* error buffer */
  pcap_t *handle;               /* packet capture handle */

  char filter_exp[] = "ip";     /* filter expression [3] */
  bpf_u_int32 mask;         /* subnet mask */
  bpf_u_int32 net;          /* ip */
  int num_packets = 10;         /* number of packets to capture */

  print_app_banner();
  printf("%s\n", pcap_lib_version());   /* never pass a non-literal as the format string */
  /* check for capture device name on command-line */
  if (argc == 2) {
    dev = argv[1];
  }
  else if (argc > 2) {
    fprintf(stderr, "error: unrecognized command-line options\n\n");
    print_app_usage();
    exit(EXIT_FAILURE);
  }
  else {
    /* find a capture device if not specified on command-line */
    dev = pcap_lookupdev(errbuf);
    if (dev == NULL) {
        fprintf(stderr, "Couldn't find default device: %s\n",
            errbuf);
        exit(EXIT_FAILURE);
    }
  }

  /* get network number and mask associated with capture device */
  if (pcap_lookupnet(dev, &net, &mask, errbuf) == -1) {
        fprintf(stderr, "Couldn't get netmask for device %s: %s\n",
        dev, errbuf);
    net = 0;
    mask = 0;
  }

  /* print capture info */
  printf("Device: %s\n", dev);
  printf("Number of packets: %d\n", num_packets);
  printf("Filter expression: %s\n", filter_exp);


  /* open capture device */
  //handle = pcap_open_live(dev, SNAP_LEN, 1, 1000, errbuf);
  handle = pcap_create(dev, errbuf);
  if (handle == NULL) {
    fprintf(stderr, "Couldn't open device %s: %s\n", dev, errbuf);
    exit(EXIT_FAILURE);
  }

  pcap_set_snaplen(handle, 1518);
  pcap_set_promisc(handle, 1);
  pcap_set_timeout(handle, 1000);
  pcap_set_buffer_size(handle, 20971520);
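  /* pcap_activate() returns a negative value on error */
  if (pcap_activate(handle) < 0) {
    fprintf(stderr, "Couldn't activate device %s: %s\n", dev, pcap_geterr(handle));
    exit(EXIT_FAILURE);
  }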


  /* make sure we're capturing on an Ethernet device [2] */
  if (pcap_datalink(handle) != DLT_EN10MB) {
    fprintf(stderr, "%s is not an Ethernet\n", dev);
    exit(EXIT_FAILURE);
  } 

  /* now we can set our callback function */
  pcap_loop(handle, 0/*num_packets*/, got_packet, NULL);

  /* cleanup */
  pcap_close(handle);

  printf("\nCapture complete.\n");

  return 0;
}

Here is the packet handler code called by pcap_loop():

/*
* dissect packet
*/
void got_packet(u_char *args, const struct pcap_pkthdr *header, const u_char *packet)
{

  static int count = 1;                   /* packet counter */

  /* declare pointers to packet headers */
  const struct sniff_ethernet *ethernet;  /* The ethernet header [1] */
  const struct sniff_ip *ip;              /* The IP header */
  const struct sniff_tcp *tcp;            /* The TCP header */
  const char *payload;                    /* Packet payload */

  int size_ip;
  int size_tcp;
  int size_payload;

  //printf("\nPacket number %d:\n", count);
  count++;
  //if(count >= 2852200)
  printf("count: %d\n", count);
  /* define ethernet header */
  ethernet = (struct sniff_ethernet*)(packet);

  /* define/compute ip header offset */
  ip = (struct sniff_ip*)(packet + SIZE_ETHERNET);
  size_ip = IP_HL(ip)*4;
  if (size_ip < 20) {
    //printf("   * Invalid IP header length: %u bytes\n", size_ip);
    return;
  }

  /* define/compute tcp header offset */
  tcp = (struct sniff_tcp*)(packet + SIZE_ETHERNET + size_ip);
  size_tcp = TH_OFF(tcp)*4;

  /* compute tcp payload (segment) size */
  size_payload = ntohs(ip->ip_len) - (size_ip + size_tcp);

  return;
}
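
Since the goal is to record source and destination IP addresses and ports, a bounds-checked version of the header walk might look like the sketch below. It is only an illustration: `struct flow_key` and `extract_flow` are hypothetical names that are not part of sniffex.c, and the sketch assumes untagged Ethernet II frames carrying IPv4 (no VLAN tags, no IPv6).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type; all fields are stored in host byte order. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;   /* 0 when no TCP/UDP header is available */
    uint8_t  proto;                /* IPv4 protocol number */
};

/* Read big-endian integers byte by byte, so alignment does not matter. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the frame is not IPv4 or was captured too short. */
int extract_flow(const uint8_t *pkt, size_t caplen, struct flow_key *out)
{
    const size_t eth_len = 14;                      /* untagged Ethernet header */
    if (caplen < eth_len + 20) return -1;           /* need a minimal IPv4 header */
    if (rd16(pkt + 12) != 0x0800) return -1;        /* EtherType must be IPv4 */

    const uint8_t *ip = pkt + eth_len;
    if ((ip[0] >> 4) != 4) return -1;               /* IP version field */
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;        /* header length in bytes */
    if (ihl < 20 || caplen < eth_len + ihl) return -1;

    out->proto    = ip[9];
    out->src_ip   = rd32(ip + 12);
    out->dst_ip   = rd32(ip + 16);
    out->src_port = 0;
    out->dst_port = 0;

    /* Only the first fragment carries the TCP/UDP header. */
    int first_fragment = (rd16(ip + 6) & 0x1fff) == 0;
    if (first_fragment && (out->proto == 6 || out->proto == 17)) {
        if (caplen < eth_len + ihl + 4) return -1;  /* ports are the first 4 bytes */
        out->src_port = rd16(ip + ihl);
        out->dst_port = rd16(ip + ihl + 2);
    }
    return 0;
}
```

Inside a pcap callback this would be called as `extract_flow(packet, header->caplen, &key)`, so that a frame truncated by the snap length is rejected rather than read past its end.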

Thank you for your help.

  • What version of libpcap is the program using? I.e., what does `printf(pcap_lib_version());` print? –  Mar 20 '15 at 06:40
  • We are using libpcap version 1.1.1. However, it has been modified to support the netmap API. – user2471905 Mar 20 '15 at 16:03
  • "However, it has been modified to support the netmap API." So it's *not* using PF_PACKET sockets, but netmap - *if* you've added the netmap kernel module and are loading it. Presumably you've done that. –  Mar 20 '15 at 18:07
  • We've added and loaded the netmap kernel module. We know that part is working correctly because we were able to get line rates using netmap's pkt-gen example program. – user2471905 Mar 20 '15 at 18:23

1 Answer

What was the CPU usage? Was it 15% of a single core or 15% of all cores? If it was 15% of all cores and you have 8 cores, that is actually more than 100% of a single core, which could explain why your single-threaded application fails to capture all packets.
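
The arithmetic behind that remark, as a tiny sketch (the function name is made up for illustration, and the 8-core count is the same assumption as above):

```c
/* Convert a utilisation figure averaged over all cores into the
 * equivalent load on a single core. Both values are percentages. */
double single_core_equivalent_pct(double all_core_pct, int n_cores)
{
    return all_core_pct * n_cores;
}
```

With 15% averaged over 8 cores this gives 120%, i.e. more work than one core can do.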

If you are unable to receive all packets using the pcap library, the remaining option is to try a different packet reception mechanism. Linux has PF_PACKET sockets, which could possibly help in your situation. According to the answer to "libpcap or PF_PACKET?", libpcap should be preferred over PF_PACKET because it is more portable and internally uses the memory-mapped mechanism of PF_PACKET, which is tricky to use directly.

Because libpcap uses the memory-mapped mechanism of PF_PACKET, you could try using PF_PACKET manually in non-memory-mapped mode, so that your packet access path differs from libpcap's. If there is a bug somewhere in the memory-mapped mode, it may be what is causing the packet loss.
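
A minimal sketch of what non-memory-mapped PF_PACKET reception could look like on Linux is shown below. The helper names `open_packet_socket` and `read_one_frame` are hypothetical, the process needs root or CAP_NET_RAW for `socket()` to succeed, and promiscuous mode is not enabled here, so treat it as a starting point rather than a drop-in replacement for the pcap code.

```c
#include <arpa/inet.h>        /* htons */
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/if.h>           /* if_nametoindex */
#include <string.h>           /* memset */
#include <sys/socket.h>       /* socket, bind, recv */
#include <sys/types.h>        /* ssize_t */
#include <unistd.h>           /* close */

/* Open a raw packet socket bound to one interface.
 * Returns a file descriptor, or -1 on any failure. */
int open_packet_socket(const char *ifname)
{
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;                       /* unknown interface */

    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;                       /* typically EPERM without CAP_NET_RAW */

    struct sockaddr_ll sll;
    memset(&sll, 0, sizeof sll);
    sll.sll_family   = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_ALL); /* all protocols */
    sll.sll_ifindex  = (int)ifindex;

    if (bind(fd, (struct sockaddr *)&sll, sizeof sll) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Copy one frame (starting at the Ethernet header) into buf.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t read_one_frame(int fd, unsigned char *buf, size_t len)
{
    return recv(fd, buf, len, 0);
}
```

Each `recv()` here costs one system call and one copy per packet, which is exactly the overhead the memory-mapped mode avoids, so this path is useful mainly as a cross-check and is unlikely to be faster.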

Have you tried recording the packet capture with tcpdump? tcpdump uses libpcap internally, so if tcpdump captures all packets and your software does not, that is evidence that the bug is in your software rather than an inherent limitation of libpcap.

juhist
  • The CPU usage was 15% of a single core. We tried testing the sniffer on a single core using taskset and saw no change in the loss rate. We just tried recording the packet capture with tcpdump, and the loss rate is almost identical. – user2471905 Mar 20 '15 at 18:26
  • So there is a high likelihood that the problem is in libpcap and not in your code. Unfortunately, that may mean the problem cannot be avoided. You could try using PF_PACKET directly, but it requires some additional work that may turn out not to be useful. Have you tried recording real TCP traffic to see if packets are missing? Wireshark should easily show the missing TCP packets. The problem may be caused by tcpreplay replaying the traffic too fast, and perhaps some tcpreplay option could help. Have you tried the --pps option of tcpreplay? – juhist Mar 20 '15 at 18:36
  • We've only tried sending with pkt-gen and tcpreplay, adjusting the speed with the pps option. We should be able to handle close to 10 Gbps if netmap-libpcap is working correctly. I think you are correct that it's most likely a problem with libpcap. It's possible that an older libpcap is being used instead of netmap-libpcap. I'm not sure how that would happen, though, since we uninstalled libpcap before installing netmap-libpcap. – user2471905 Mar 20 '15 at 19:59
  • Try modifying the netmap-libpcap pcap.c file to have the `pcap_lib_version()` routine return a string that includes "netmap"; that way, the printf will indicate whether you're using netmap-libpcap or the standard version of libpcap, which would use PF_PACKET sockets rather than netmap. –  Mar 21 '15 at 19:56
  • (For example, modify the Makefile.in rule that generates version.h to do `sed -e 's/.*/static const char pcap_version_string[] = "netmap-libpcap version &";/' > $@`, and then re-run the configure script, rebuild netmap-libpcap, reinstall it, and rebuild your program with it. If it reports just "libpcap version 1.1.1", it's not netmap-libpcap; if it reports "netmap-libpcap version 1.1.1", it is.) –  Mar 21 '15 at 20:04
  • We ended up doing a fresh install of Ubuntu 14.04. We only installed netmap-libpcap, so we are sure it's the only version being used. However, we are still losing packets as before. Now we are trying to receive packets with PF_RING, but we lose the same number of packets. I'm not sure where to go from here. – user2471905 Apr 10 '15 at 06:23
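
For reference when relating the bit rates in this thread to packets-per-second settings, the sketch below converts a link rate into a frame rate. It is an illustration only: `frames_per_second` is a made-up helper, the frame size is assumed to include the 4-byte FCS, and the 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) are an assumption about how the rate is measured, which a given tool may or may not share.

```c
/* Assumed per-frame wire overhead on Ethernet:
 * 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap. */
#define WIRE_OVERHEAD_BYTES 20

/* Frames per second that fit on a link of link_bps bits per second
 * when every frame is frame_bytes long (FCS included). */
double frames_per_second(double link_bps, unsigned frame_bytes)
{
    return link_bps / (8.0 * (frame_bytes + WIRE_OVERHEAD_BYTES));
}
```

At 10 Gbps this gives roughly 14.88 million frames per second for 64-byte frames but only about 0.81 million for 1518-byte frames, so the same bit rate can stress a capture path very differently depending on the packet size mix.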