I need to extract the names of all the protocols that appear in PCAP files, so I have to parse them. After some research, I was told that dpkt is very efficient for this. I am writing the script in Python; here is the code:

import socket
import time

import dpkt


def inet_to_str(inet):
    # First try IPv4 and then IPv6
    try:
        return socket.inet_ntop(socket.AF_INET, inet)
    except ValueError:
        return socket.inet_ntop(socket.AF_INET6, inet)

def read_packet(pcap):
    with open('/XYZ/XYZ/XYZ/XYZ/XYZ/' + str(pcap), "rb") as f:
        # Separate name for the reader so the filename argument is not shadowed
        reader = dpkt.pcap.Reader(f)
        for timestamp, buf in reader:

            # Not printing out the timestamp for now
            # print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))

            # Unpacking the Ethernet frame
            eth = dpkt.ethernet.Ethernet(buf)

            # Not printing the Ethernet frame
            # print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)

            # Making sure the Ethernet frame contains an IP packet
            if not isinstance(eth.data, dpkt.ip.IP):
                print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
                continue

            # Now unpack the data within the Ethernet frame (the IP packet)
            # Pulling out src, dst, length, fragment info, TTL, and Protocol
            ip = eth.data
            # dp = ip.data
            # proto = type(udp.data)
            # print(proto)
            time.sleep(3)
            # Pull out fragment information (flags and offset all packed into off field, so use bitmasks)
            do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
            more_fragments = bool(ip.off & dpkt.ip.IP_MF)
            fragment_offset = ip.off & dpkt.ip.IP_OFFMASK

            # Print out the info
            print('IP: %s -> %s   (len=%d ttl=%d DF=%d MF=%d offset=%d) Protocol=%s\n' %
                  (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl,
                   do_not_fragment, more_fragments, fragment_offset,
                   ip.get_proto(ip.p).__name__))
            time.sleep(5)

The problem is that this code gives me the transport-layer protocol (TCP/UDP) but not the application-layer protocol (SSH, DHCP, DNS, etc.). I read the documentation and found that there are modules for analyzing specific packet types if you already know them, but I want to do this automatically because I have millions of PCAP files. I want to identify the application-layer protocol automatically and then call an appropriate function to analyze it. Is there a way to at least get the names of the protocols?

Mr.X

1 Answer

You can look at the port numbers in the transport-layer header (TCP or UDP) to infer which application-layer protocol is in use. Here is a list of port numbers and the application-layer protocols that use them: https://en.wikibooks.org/wiki/A-level_Computing/AQA/Paper_2/Fundamentals_of_communication_and_networking/Standard_application_layer_protocols

With dpkt, you can add this to your code:

ip = eth.data
tcp = ip.data

If you are looking for a specific protocol, you can set up if statements like so:

if tcp.dport == 80:
    print("Application Layer protocol used: HTTP")
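
Rather than writing one if statement per port, the same check can be done as a table lookup that considers both the source and the destination port, so replies from the server are labeled too. This is only a minimal sketch: `WELL_KNOWN_PORTS` and `guess_app_protocol` are hypothetical names (not part of dpkt), and the table lists just a few standard port assignments that you would extend yourself.

```python
# Hypothetical lookup table (not part of dpkt): a few standard server ports.
# Keys are (transport name, port number); extend as needed for your captures.
WELL_KNOWN_PORTS = {
    ("TCP", 22): "SSH",
    ("TCP", 53): "DNS",
    ("UDP", 53): "DNS",
    ("UDP", 67): "DHCP",
    ("UDP", 68): "DHCP",
    ("TCP", 80): "HTTP",
    ("TCP", 443): "HTTPS",
}


def guess_app_protocol(transport, sport, dport):
    """Return a protocol name if either port is in the table, else None."""
    # Try the destination port first (client -> server), then the source
    # port (server -> client).
    for port in (dport, sport):
        name = WELL_KNOWN_PORTS.get((transport, port))
        if name is not None:
            return name
    return None
```

Inside the packet loop this could be called as `guess_app_protocol(type(ip.data).__name__, ip.data.sport, ip.data.dport)`, after first checking that `ip.data` is a `dpkt.tcp.TCP` or `dpkt.udp.UDP` instance, since other payloads have no ports.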
Gero M
  • The problem is that some servers may not use the default ports, so I would miss data. Thank you for the suggestion, though. – Mr.X Oct 23 '22 at 20:43
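
For servers on non-default ports, one possible workaround is to look at the first bytes of the TCP/UDP payload instead of the port number. The following is a rough sketch, not a complete classifier: `sniff_payload` and the constants are hypothetical names, the signatures cover only a handful of protocols, and packets that are encrypted or fall in the middle of a stream will not match anything.

```python
# Hypothetical payload signatures (illustration only, far from exhaustive).
HTTP_PREFIXES = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ",
                 b"OPTIONS ", b"HTTP/1.")
# DHCP messages carry this magic cookie right after the 236-byte BOOTP header.
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"


def sniff_payload(payload):
    """Guess the application protocol from raw payload bytes, or return None."""
    # SSH peers open the connection with an identification string "SSH-...".
    if payload.startswith(b"SSH-"):
        return "SSH"
    # Plain-text HTTP starts with a request method or a status line.
    if payload.startswith(HTTP_PREFIXES):
        return "HTTP"
    # TLS handshake record: content type 0x16, major version 0x03.
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "TLS"
    # Slicing past the end of a short payload yields b"", so this is safe.
    if payload[236:240] == DHCP_MAGIC_COOKIE:
        return "DHCP"
    return None
```

With dpkt, the bytes to pass in would be `ip.data.data` when `ip.data` is a `dpkt.tcp.TCP` or `dpkt.udp.UDP` instance. Combining this with the port check gives a fallback when a payload carries no recognizable marker.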