I need to extract the names of all the protocols that appear in PCAP files, which means parsing them. After some research I was told that dpkt is very efficient for this. I am writing the script in Python, and the code is below:

    import socket
    import time

    import dpkt


    def inet_to_str(inet):
        """Convert a packed IP address to its printable string form."""
        # First try IPv4 and then IPv6
        try:
            return socket.inet_ntop(socket.AF_INET, inet)
        except ValueError:
            return socket.inet_ntop(socket.AF_INET6, inet)


    def read_packet(pcap):
        with open('/XYZ/XYZ/XYZ/XYZ/XYZ/' + str(pcap), 'rb') as f:
            # Use a separate name so the file-name argument is not shadowed
            reader = dpkt.pcap.Reader(f)
            for timestamp, buf in reader:
                # Not printing out the timestamp for now
                # print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))

                # Unpack the Ethernet frame
                eth = dpkt.ethernet.Ethernet(buf)
                # Not printing the Ethernet frame
                # print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)

                # Make sure the Ethernet frame contains an IP packet
                if not isinstance(eth.data, dpkt.ip.IP):
                    print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
                    continue

                # Now unpack the data within the Ethernet frame (the IP packet)
                # Pull out src, dst, length, fragment info, TTL, and protocol
                ip = eth.data
                # dp = ip.data
                # proto = type(udp.data)
                # print(proto)
                time.sleep(3)

                # Pull out fragment information (flags and offset are all packed
                # into the off field, so use bitmasks)
                do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
                more_fragments = bool(ip.off & dpkt.ip.IP_MF)
                fragment_offset = ip.off & dpkt.ip.IP_OFFMASK

                # Print out the info
                print('IP: %s -> %s (len=%d ttl=%d DF=%d MF=%d offset=%d) Protocol=%s\n' %
                      (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl,
                       do_not_fragment, more_fragments, fragment_offset,
                       ip.get_proto(ip.p).__name__))
                time.sleep(5)

The problem is that this code gives me the transport-layer protocol (TCP/UDP) but not the application-layer protocol (SSH, DHCP, DNS, etc.). I read the documentation and found that dpkt has modules for analyzing specific packet types, but only if you already know which type you are dealing with. I have millions of pcap files, so I need to identify the application-layer protocol automatically and then call the appropriate function to analyze it. Is there a way to at least get the names of the protocols?
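
To make the goal concrete, here is a rough sketch of the kind of lookup I have in mind. It assumes that well-known port numbers identify the application protocol, which is only a heuristic (traffic on non-standard ports would be missed or mislabeled). The table `WELL_KNOWN_PORTS` and the function `guess_app_protocol` are hypothetical names of my own, not part of dpkt, and the table is deliberately incomplete.

```python
# Hypothetical, deliberately incomplete table mapping
# (transport name, port) -> application protocol name.
WELL_KNOWN_PORTS = {
    ("TCP", 22): "SSH",
    ("TCP", 53): "DNS",
    ("UDP", 53): "DNS",
    ("UDP", 67): "DHCP",
    ("UDP", 68): "DHCP",
    ("TCP", 80): "HTTP",
    ("TCP", 443): "TLS",
}


def guess_app_protocol(transport, sport, dport):
    """Guess the application protocol from the ports; None if unknown.

    This is a port-based heuristic only: it does not inspect the payload.
    """
    # Check the destination port first, then the source port, so that
    # both requests and replies of a known service are matched.
    for port in (dport, sport):
        name = WELL_KNOWN_PORTS.get((transport, port))
        if name is not None:
            return name
    return None
```

Inside the loop above, this would presumably be called only when `ip.data` is a `dpkt.tcp.TCP` or `dpkt.udp.UDP` instance, passing `ip.data.__class__.__name__`, `ip.data.sport`, and `ip.data.dport`. The returned name could then be used to pick the matching analysis function.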