
I'm using libpcap (via the pcap crate) and I want to reconstruct individual TCP flows from packets. I have to match each packet to a flow in a way that works for both directions (client->server and server->client) with as little overhead as possible. A flow is identified by its 4-tuple (src-addr, dst-addr, src-port, dst-port).

I was wondering whether XOR-ing the source and destination IPs, and the source and destination ports, would be good enough to match packets to a TCP flow. Basically something like this:

  • create a hashmap (flow_key, array-of-packets)
  • get the next packet from libpcap
  • extract src/dst ip and port
  • flow_key = hash( XOR(src-ip, dst-ip), XOR(src-port, dst-port) )
  • if flow_key is in the hashmap then add the packet to the hashmap value
  • if not, create a new entry in the hashmap
  • drop the entry when FIN or RST is observed and process the collected packets
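
The steps above can be sketched roughly as follows. This is only an illustration: `Packet`, `flow_key` and `collect_flows` are hypothetical names, and the packets are assumed to be already parsed (with the pcap crate, the addresses, ports and flags would first have to be decoded from the raw bytes).

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Hypothetical, already-parsed packet (not a type from the pcap crate).
#[derive(Debug, Clone)]
struct Packet {
    src_ip: Ipv4Addr,
    dst_ip: Ipv4Addr,
    src_port: u16,
    dst_port: u16,
    fin_or_rst: bool,
}

// Direction-agnostic key as proposed: XOR of the addresses and of the ports.
// The HashMap hashes this tuple itself, so no explicit hash() call is needed.
fn flow_key(p: &Packet) -> (u32, u16) {
    (u32::from(p.src_ip) ^ u32::from(p.dst_ip), p.src_port ^ p.dst_port)
}

// Groups packets by key and hands a flow back once a FIN or RST is seen.
fn collect_flows(packets: Vec<Packet>) -> Vec<Vec<Packet>> {
    let mut table: HashMap<(u32, u16), Vec<Packet>> = HashMap::new();
    let mut finished = Vec::new();
    for p in packets {
        let key = flow_key(&p);
        let done = p.fin_or_rst;
        table.entry(key).or_default().push(p);
        if done {
            // Drop the entry and pass the collected packets on for processing.
            if let Some(flow) = table.remove(&key) {
                finished.push(flow);
            }
        }
    }
    finished
}

fn main() {
    let client = Ipv4Addr::new(192, 168, 0, 1);
    let server = Ipv4Addr::new(127, 0, 0, 1);
    let packets = vec![
        Packet { src_ip: client, dst_ip: server, src_port: 7879, dst_port: 80, fin_or_rst: false },
        Packet { src_ip: server, dst_ip: client, src_port: 80, dst_port: 7879, fin_or_rst: true },
    ];
    let flows = collect_flows(packets);
    println!("{} finished flow(s), {} packet(s) in the first", flows.len(), flows[0].len());
}
```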

This way, I would not care in which direction a packet is flowing, and there would be no need to implement more detailed connection tracking, deal with connection state, etc. (I assume).

I did some tests in Rust, like the code below, which seems to work as expected, but I'm not sure whether the idea is valid:

use std::collections::hash_map::DefaultHasher;
use std::{net::Ipv4Addr, str::FromStr};
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq)]
struct TestS {
    src_ip: Ipv4Addr,
    dst_ip: Ipv4Addr,
    src_port: u16,
    dst_port: u16
}

impl Hash for TestS {
    fn hash<H: Hasher>(&self, state: &mut H) {
        let hashable_ip = u32::from(self.src_ip) ^ u32::from(self.dst_ip);
        let hashable_port = self.src_port ^ self.dst_port;

        hashable_ip.hash(state);
        hashable_port.hash(state);
    }

}

fn main() {

    let a_packet = TestS{
        src_ip: Ipv4Addr::from_str("192.168.0.1").unwrap(),
        dst_ip: Ipv4Addr::from_str("127.0.0.1").unwrap(),
        src_port: 7879,
        dst_port:80
    };

    let another_packet = TestS{
        src_ip: Ipv4Addr::from_str("127.0.0.1").unwrap(),
        dst_ip: Ipv4Addr::from_str("192.168.0.1").unwrap(),
        src_port: 80,
        dst_port: 7879
    };

    let mut hash_state = DefaultHasher::new();
    a_packet.hash(&mut hash_state);
    let hash1 = hash_state.finish();

    let mut hash_state2 = DefaultHasher::new();
    another_packet.hash(&mut hash_state2);
    let hash2 = hash_state2.finish();

    println!("{} -- {}", hash1, hash2);
    //hash1 and hash2 will be equal

}
thek33per
    So if I connect to port 80 from port 1024, get redirected to the HTTPS site, then connect to port 443 from port 1515, your algorithm will think it's the same connection. Seems flaky to me, but we don't know if it's going to be acceptable for your use case. – Colonel Thirty Two Sep 18 '22 at 19:49

1 Answer


Note that if you have streams with crossed ports (e.g. a stream between S:1 and C:2 and another stream between S:2 and C:1), then they will be taken as the same flow.
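
A small check makes the collision concrete. The sketch below recomputes the XOR-based key from the question for two crossed-port streams, and for the port pairs mentioned in the comment on the question. The `xor_key` helper and the 10.0.0.x addresses are made up for illustration.

```rust
use std::net::Ipv4Addr;

// The XOR-based key from the question: (src_ip ^ dst_ip, src_port ^ dst_port).
fn xor_key(src_ip: Ipv4Addr, src_port: u16, dst_ip: Ipv4Addr, dst_port: u16) -> (u32, u16) {
    (u32::from(src_ip) ^ u32::from(dst_ip), src_port ^ dst_port)
}

fn main() {
    // Illustrative addresses for S and C.
    let s = Ipv4Addr::new(10, 0, 0, 1);
    let c = Ipv4Addr::new(10, 0, 0, 2);

    // Crossed ports: S:1 <-> C:2 and S:2 <-> C:1 are two distinct flows,
    // yet 1 ^ 2 == 2 ^ 1, so they get the same key.
    assert_eq!(xor_key(s, 1, c, 2), xor_key(s, 2, c, 1));

    // Ports from the comment: 1024 ^ 80 and 1515 ^ 443 are both 1104,
    // so the HTTP and HTTPS connections between the same hosts collide too.
    assert_eq!(xor_key(c, 1024, s, 80), xor_key(c, 1515, s, 443));

    println!("both pairs of distinct flows produce identical keys");
}
```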

A more robust solution would be to sort the (IP, port) pairs prior to hashing:

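// Inside `fn hash` of `impl Hash for TestS`; needs `use std::cmp::{max, min};`.
let src_ip = u32::from(self.src_ip);
let dst_ip = u32::from(self.dst_ip);
// Order the two (IP, port) endpoints so both directions hash identically.
let (ip1, port1) = min((src_ip, self.src_port), (dst_ip, self.dst_port));
let (ip2, port2) = max((src_ip, self.src_port), (dst_ip, self.dst_port));
ip1.hash(state);
ip2.hash(state);
port1.hash(state);
port2.hash(state);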
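
One caveat if the struct is used directly as a `HashMap` key: the map also compares keys with `Eq`, so a custom `Hash` alone is not enough. With the derived `PartialEq` on `TestS`, the two directions would hash alike but compare unequal, and would land in separate entries. A minimal sketch of one way around this, assuming a dedicated key type that stores the endpoints in sorted order so that `Hash` and `Eq` can simply be derived (the `FlowKey` name and its fields are illustrative, not from the question):

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Direction-independent flow key: the two endpoints kept in sorted order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct FlowKey {
    lo: (Ipv4Addr, u16),
    hi: (Ipv4Addr, u16),
}

impl FlowKey {
    fn new(src_ip: Ipv4Addr, src_port: u16, dst_ip: Ipv4Addr, dst_port: u16) -> Self {
        let a = (src_ip, src_port);
        let b = (dst_ip, dst_port);
        // Tuples compare lexicographically: by address first, then by port.
        if a <= b {
            FlowKey { lo: a, hi: b }
        } else {
            FlowKey { lo: b, hi: a }
        }
    }
}

fn main() {
    let client = Ipv4Addr::new(192, 168, 0, 1);
    let server = Ipv4Addr::new(127, 0, 0, 1);

    // Hypothetical flow table mapping a key to the payloads seen so far.
    let mut flows: HashMap<FlowKey, Vec<Vec<u8>>> = HashMap::new();
    flows
        .entry(FlowKey::new(client, 7879, server, 80))
        .or_default()
        .push(b"request".to_vec());
    flows
        .entry(FlowKey::new(server, 80, client, 7879))
        .or_default()
        .push(b"response".to_vec());

    // Both directions share one entry.
    assert_eq!(flows.len(), 1);
    // Crossed ports no longer collide.
    assert_ne!(FlowKey::new(server, 1, client, 2), FlowKey::new(server, 2, client, 1));
    println!("{} flow(s) tracked", flows.len());
}
```

Because the key is canonical at construction time, no hand-written `Hash` or `PartialEq` implementation is needed, and the two cannot drift out of sync.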
Jmb