I need to process a 7 GB pcap file to extract the size of each packet and of its payload. I initially used scapy's PcapReader for this, but scapy is very slow on a file that large. So I switched to the dpkt library; however, I don't know how to get the TCP payload size with it.

import dpkt
payload_size=[]
packet_size=[]

for ts,buf in dpkt.pcapng.Reader(open('pcap file','rb')):
    eth=dpkt.ethernet.Ethernet(buf) 
    if eth.type==dpkt.ethernet.ETH_TYPE_IP:
        ip=eth.data
        if ip.p==dpkt.ip.IP_PROTO_TCP:
            packet_size.append(ip.len)
            payload_size.append(?)
    else:
        pass

1 Answer

Looking at the source for dpkt's IP class:

    def __len__(self):
        return self.__hdr_len__ + len(self.opts) + len(self.data)

The length is calculated as the fixed header plus the options plus the data. So I think you can get the IP payload length with:

payload_size.append(len(ip.data))

Update:

The OP wanted the TCP payload. The source of dpkt's TCP class is similar:

    def __len__(self):
        return self.__hdr_len__ + len(self.opts) + len(self.data)

So the length of the TCP payload should be len(ip.data.data).

if ip.p==dpkt.ip.IP_PROTO_TCP:
    tcp = ip.data
    payload_len = len(tcp.data)
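As a cross-check on the header-length question, the same number can be derived from the header fields alone: IPv4 total length minus the IP header length minus the TCP header length. Below is a minimal sketch using only the standard library on raw IPv4 bytes, not dpkt. The function name `tcp_payload_len` and the hand-built `sample` packet are hypothetical, made up for illustration. In dpkt the corresponding expression would presumably be `ip.len - ip.hl * 4 - tcp.off * 4`, but treat those attribute names as an assumption to verify against your dpkt version.

```python
import struct

def tcp_payload_len(ip_bytes: bytes) -> int:
    """Return the TCP payload length declared by the headers of a raw IPv4 packet."""
    # First 4 bytes of IPv4: version/IHL, TOS, total length (network byte order).
    ver_ihl, _tos, total_len = struct.unpack("!BBH", ip_bytes[:4])
    ip_hdr_len = (ver_ihl & 0x0F) * 4            # IHL counts 32-bit words
    if ip_bytes[9] != 6:                         # protocol field; 6 means TCP
        raise ValueError("not a TCP packet")
    # Upper 4 bits of TCP header byte 12 hold the data offset in 32-bit words.
    tcp_hdr_len = (ip_bytes[ip_hdr_len + 12] >> 4) * 4
    return total_len - ip_hdr_len - tcp_hdr_len

# Hand-built sample: 20-byte IP header + 20-byte TCP header + 5-byte payload.
payload = b"hello"
ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload),
                     0, 0, 64, 6, 0, bytes(4), bytes(4))
tcp_hdr = struct.pack("!HHIIHHHH", 1234, 80, 0, 0, (5 << 12) | 0x18, 8192, 0, 0)
sample = ip_hdr + tcp_hdr + payload

print(tcp_payload_len(sample))
```

The difference from `len(tcp.data)` is that this uses the lengths declared in the headers, so it reports the on-the-wire payload size even if the capture was truncated by a snap length. With a complete capture both approaches should agree.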
saquintes
  • I think `len(ip.data)` is the TCP payload length plus the TCP header length. So the problem is how to determine the TCP header length; then we can get the TCP payload length as len(ip.data) minus the TCP header length. – user14834847 Dec 16 '20 at 07:34
  • Same process, just another layer. Updated the answer. – saquintes Dec 16 '20 at 20:03