
I'm trying to calculate throughput from the output of tcpdump using Python. So far I have called tcpdump from Python and written its output to a text file. Sample output:

01:06:23.649464 0us IP (tos 0x0, ttl 128, id 63533, offset 0, flags [none], proto UDP (17), length 72) 251.146.199.137.1066 > 156.96.135.220.62827: UDP, length 44

01:06:23.920316 0us IP (tos 0x0, ttl 1, id 10354, offset 0, flags [none], proto IGMP (2), length 32, options (RA)) 251.146.198.120 > fm-dyn-140-0-193-221.fast.net.id: [|igmp]

However, I'm stuck on the next part: extracting the time and the length (the first one on each line) and calculating the throughput. I'm new to Python and don't have a clear idea of how regular expressions work. Also, since the timestamps include microseconds, is there an easy way to work with them when calculating throughput?

Thanks in advance.

Bappy
  • possible duplicate of [Parsing large tcpdump files in python](http://stackoverflow.com/questions/14410580/parsing-large-tcpdump-files-in-python) – ivan_pozdeev Oct 29 '14 at 17:54
  • Nope. In this one, they're reading the *text* output from tcpdump; in the other question, they appear to be reading a pcap binary file, and, with the right Python package to read pcap files, getting the time stamps is pretty easy. Extracting it from the text involves a little more work. –  Oct 29 '14 at 21:35
  • @GuyHarris Exactly. To reliably parse the output, you need to use a documented format which the text output is not. – ivan_pozdeev Nov 06 '14 at 18:08
  • Well, the time stamp format, at least, is now documented; I just updated the tcpdump man page to document the time stamp format. (The rest isn't documented, and changes more often.) –  Nov 06 '14 at 23:32

1 Answer


Forget about regex; you can use the datetime module to parse the timestamps.

Using datetime

>>> from datetime import datetime
>>> lines = ['01:06:23.649464 0us IP (tos 0x0, ttl 128, id 63533, offset 0, flags [none], proto UDP (17), length 72) 251.146.199.137.1066 > 156.96.135.220.62827: UDP, length 44', '01:06:23.920316 0us IP (tos 0x0, ttl 1, id 10354, offset 0, flags [none], proto IGMP (2), length 32, options (RA)) 251.146.198.120 > fm-dyn-140-0-193-221.fast.net.id: [|igmp]']
>>> times = [datetime.strptime(line[:15], '%H:%M:%S.%f') for line in lines]

Once the timestamps are parsed with strptime from datetime, the elapsed time needed for the throughput can be calculated directly by subtracting one datetime from another:

>>> times[1] - times[0]
datetime.timedelta(0, 0, 270852)
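
To go from the time difference to an actual throughput figure, the first length field on each line also has to be read. The following is only a minimal sketch, not part of the original answer: it assumes every line starts with an HH:MM:SS.ffffff timestamp, that the first "length N" on the line is the IP total length in bytes, and that the capture does not cross midnight (the timestamps carry no date). It does use a small regex for the length, and the helper names parse_line and throughput_bytes_per_second are hypothetical.

```python
import re
from datetime import datetime

# Leading timestamp, then the first "length N" on the line (non-greedy match).
# Assumption: that first length is the IP total length in bytes.
LINE_RE = re.compile(r'^(\d{2}:\d{2}:\d{2}\.\d{6}) .*?\blength (\d+)')


def parse_line(line):
    """Return (timestamp, length) for one tcpdump text line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), '%H:%M:%S.%f')
    return stamp, int(match.group(2))


def throughput_bytes_per_second(lines):
    """Total bytes seen divided by the time between the first and last matching line."""
    parsed = [p for p in (parse_line(line) for line in lines) if p is not None]
    if len(parsed) < 2:
        return None  # need at least two packets to have a time span
    elapsed = (parsed[-1][0] - parsed[0][0]).total_seconds()
    if elapsed <= 0:
        return None  # identical timestamps, or the capture wrapped past midnight
    total_bytes = sum(length for _, length in parsed)
    return total_bytes / elapsed
```

Calling throughput_bytes_per_second(lines) on the two sample lines divides 72 + 32 bytes by the 0.270852 seconds between them. Whether the first packet's bytes should count toward the total is a design choice; this sketch includes them. Multiply the result by 8 to get bits per second.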
Mauro Baraldi