Lost in libpcap - how to use setnonblock() / should I use pcap_dispatch() or pcap_next_ex() for realtime?

Question

I'm building a network sniffer that will run on a pfSense box to monitor an IPsec VPN. I'm compiling under FreeBSD 8.4.

I've chosen libpcap in C as the packet capture engine and Redis as the storage system. There will be hundreds of packets per second to handle, and the system will run all day long.

The goal is a web page showing graphs of network activity, updated every minute, or every few seconds if that's possible. When my sniffer captures a packet, it will determine its size, who sent it (a geographical site, in our VPN context), to whom, and when. That information then needs to be stored in the database.

I've done a lot of research, but I'm a little lost with libpcap, specifically with the way I should capture packets.

1) Which function should I use to retrieve packets: pcap_loop(), pcap_dispatch(), or pcap_next_ex()? According to what I read online, pcap_loop() and pcap_dispatch() block execution, so pcap_next_ex() seems to be the solution.

2) When are we supposed to use pcap_setnonblock()? I mean, with which capture function? pcap_loop()? So if I use pcap_loop(), execution won't be blocked?

3) Is multi-threading the way to achieve this? One thread running all the time, capturing packets, analyzing them and storing some data in an array, and a second thread firing every minute to empty this array?

The more I think about it, the more I get lost, so please excuse me if I'm unclear, and don't hesitate to ask me for clarification.

Any help is welcome.

Edit :

I'm currently trying to implement a worker pool, with the callback function only putting a new job in the job queue. Any help is still welcome. I will post more details later.

score 3 · Accepted Answer · answered Mar 26 '15 at 20:00

1) Which function should I use to retrieve packets: pcap_loop(), pcap_dispatch(), or pcap_next_ex()? According to what I read online, pcap_loop() and pcap_dispatch() block execution, so pcap_next_ex() seems to be the solution.

All of those functions block waiting for packets if there's no packet ready and you haven't put the pcap_t into non-blocking mode.

pcap_loop() contains a loop that runs indefinitely (or until pcap_breakloop() is called, an error occurs, or, if a count was specified, the count runs out). Thus, it may wait more than once.

pcap_dispatch() blocks waiting for a batch of packets to arrive if none are available, loops processing that batch, and then returns, so it waits at most once.

pcap_next() and pcap_next_ex() wait for a packet to be available and then return it.

2) When are we supposed to use pcap_setnonblock()? I mean, with which capture function? pcap_loop()? So if I use pcap_loop(), execution won't be blocked?

No, it won't be, but that means that a call to pcap_loop() might return without processing any packets; you'll have to call it again to process any packets that arrive later. Combining non-blocking mode with pcap_loop() isn't really useful; you might as well use pcap_dispatch() or pcap_next_ex().

3) Is multi-threading the way to achieve this? One thread running all the time, capturing packets, analyzing them and storing some data in an array, and a second thread firing every minute to empty this array?

(The array would presumably be a queue, with the first thread appending packet data to the end of the queue and the second thread removing packet data from the head of the queue and putting it into the database.)

That would be one possibility, although I'm not sure whether it should be timer-based. Given that many packet capture mechanisms - including BPF, as used by *BSD and OS X - deliver packets in batches, perhaps you should have one loop that does something like

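for (;;) {
    /* cnt = -1: process every packet in the current batch, then return */
    pcap_dispatch(p, -1, callback, NULL);
    /* placeholder: e.g. signal a condition variable the dumper waits on */
    wake_up_dumper_thread();
}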

(that's simplified - you might want to check for errors in the return value from pcap_dispatch()).

The callback would add the packet handed to it to the end of the queue.

The dumper thread would pull packets from the head of the queue and add them to the database.

This would not require that the pcap_t be non-blocking. On a single-CPU-core machine, it would rely on the threads being time-sliced by the scheduler.
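As a rough illustration of the queue shared by the two threads, here is a minimal sketch using POSIX threads. The names job, job_queue, queue_init(), queue_push() and queue_pop() are hypothetical, not part of libpcap, and the fields in struct job are only an assumption about what the dumper needs.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical record: only what the dumper needs, not the whole packet. */
struct job {
    struct job   *next;
    unsigned int  wire_len;   /* original packet length in bytes */
    unsigned char src[4];     /* IPv4 source address */
    unsigned char dst[4];     /* IPv4 destination address */
};

struct job_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct job     *head;
    struct job     *tail;
};

void queue_init(struct job_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
}

/* Capture thread: append a copy of *src. Returns 0, or -1 if malloc fails. */
int queue_push(struct job_queue *q, const struct job *src)
{
    struct job *j = malloc(sizeof *j);
    if (j == NULL)
        return -1;
    *j = *src;
    j->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = j;
    else
        q->head = j;
    q->tail = j;
    pthread_cond_signal(&q->nonempty);   /* wake the dumper if it is waiting */
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Dumper thread: block until a job is available; the caller frees it. */
struct job *queue_pop(struct job_queue *q)
{
    struct job *j;

    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    j = q->head;
    q->head = j->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    j->next = NULL;
    return j;
}
```

This variant signals the dumper on every push, which is the simplest form. To wake the dumper only once per batch, as in the loop above, the pthread_cond_signal() call could instead be made after pcap_dispatch() returns.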

I'm currently trying to implement a worker pool, with the callback function only putting a new job in the job queue.

Be aware that, once the callback function returns, neither the const struct pcap_pkthdr structure pointed to by its second argument nor the raw packet data pointed to by its third argument will be valid, so if you want to process them in the job, you will have to make copies of them - including a copy of all the packet data you will be processing in that job. You might want to consider doing some processing in the callback routine, even if it's only figuring out what packet data needs to be saved (e.g., IP source and destination address) and copying it, as that might be cheaper than copying the entire packet.
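To make the "copy only what you need" idea concrete, here is a minimal sketch of a helper that a callback could call before returning. It assumes an Ethernet link layer (DLT_EN10MB) with no VLAN tag and IPv4 traffic; a capture on a different interface type (an IPsec pseudo-interface, for example) would need different offsets. The names flow_record and extract_flow() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record holding the few fields worth keeping per packet. */
struct flow_record {
    unsigned int  wire_len;   /* length of the packet on the wire */
    long          ts_sec;     /* capture timestamp, seconds */
    unsigned char src[4];     /* IPv4 source address, network order */
    unsigned char dst[4];     /* IPv4 destination address, network order */
};

#define ETH_HDR_LEN  14       /* assumption: untagged Ethernet II frame */
#define IPV4_MIN_LEN 20

/*
 * Copy the interesting fields out of a captured frame.
 * Returns 0 on success, -1 if the frame is too short or is not IPv4.
 */
int extract_flow(const unsigned char *bytes, size_t caplen,
                 unsigned int wire_len, long ts_sec,
                 struct flow_record *out)
{
    const unsigned char *ip;

    if (caplen < ETH_HDR_LEN + IPV4_MIN_LEN)
        return -1;
    if (bytes[12] != 0x08 || bytes[13] != 0x00)   /* EtherType is not IPv4 */
        return -1;

    ip = bytes + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)                        /* IP version field */
        return -1;

    out->wire_len = wire_len;
    out->ts_sec   = ts_sec;
    memcpy(out->src, ip + 12, 4);                 /* source address field */
    memcpy(out->dst, ip + 16, 4);                 /* destination address field */
    return 0;
}
```

Inside the callback, with h being the const struct pcap_pkthdr pointer, this would be called as extract_flow(bytes, h->caplen, h->len, (long)h->ts.tv_sec, &rec). The resulting record is about 24 bytes instead of a full packet, and it stays valid after the callback returns.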

Thanks for your time! I'm no longer working on this project for now, but when I reopen it, your explanations will certainly be helpful. - bdelphin, May 20 '15 at 17:57
