I'm using scapy to sniff a mirror port and generate a list of the top 10 "talkers", i.e. a list of hosts using the most bandwidth on my network. I'm aware of tools already available such as iftop and ntop, but I need more control over the output.
The following script samples traffic for 30 seconds and then prints a list of the top 10 talkers in the format "source host -> destination host: bytes". That's great, but how can I calculate average bytes per second?
Lowering sample_interval to 1 second doesn't seem to give a representative sample of the traffic, so I think I need to average over a longer interval. So I tried this at the end of the script:
bytes per second = (total bytes / sample_interval)
but the resulting rate seems far too low. For example, I ran an rsync between two hosts throttled to 1.5 MB/s, but with the calculation above my script kept reporting around 200 KB/s between these hosts, far below the 1.5 MB/s I expected. iftop confirms that 1.5 MB/s is in fact the rate between these two hosts.
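To make the calculation above concrete, here is a minimal sketch of the averaging I have in mind. The `average_rates` helper, the addresses, and the sample totals are hypothetical and not part of my script; it simply divides each pair's byte total by the capture time in seconds.

```python
def average_rates(totals, elapsed_seconds):
    """Return {(src, dst): bytes per second} for the given byte totals."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # float() keeps the division exact under Python 2 as well as Python 3
    return {pair: total / float(elapsed_seconds) for pair, total in totals.items()}


# Hypothetical totals: a 1.5 MB/s transfer sustained for 30 s is 45 MB.
sample_totals = {("10.0.0.1", "10.0.0.2"): 45 * 1024 * 1024}
rates = average_rates(sample_totals, 30)
print(rates[("10.0.0.1", "10.0.0.2")])  # 1572864.0 bytes/s, i.e. 1.5 MB/s
```

By the same arithmetic, a reported 200 KB/s over 30 seconds means the script only accumulated about 6 MB instead of 45 MB, so the shortfall appears to be in the byte totals rather than in the division.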
Am I totaling up packet lengths incorrectly with scapy (see traffic_monitor_callbak function)? Or is this a poor solution altogether :)?
from scapy.all import *
from collections import defaultdict
import socket
from pprint import pprint
from operator import itemgetter

sample_interval = 30  # how long to capture traffic, in seconds

# total bytes seen per (source, destination) pair; missing keys start at 0
traffic = defaultdict(int)

# return human readable units given bytes
def human(num):
    for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if num < 1024.0:
            return "%3.1f %s" % (num, x)
        num /= 1024.0
    return "%3.1f %s" % (num, 'PB')

# callback function to process each packet
# add each packet's IP length to the total for its source->destination pair
def traffic_monitor_callbak(pkt):
    if IP in pkt:
        src = pkt.sprintf("%IP.src%")
        dst = pkt.sprintf("%IP.dst%")
        size = pkt.sprintf("%IP.len%")
        # count every packet, including the first one seen for a pair
        traffic[(src, dst)] += int(size)

sniff(iface="eth1", prn=traffic_monitor_callbak, store=0, timeout=sample_interval)

# sort by total bytes, descending
traffic_sorted = sorted(traffic.items(), key=itemgetter(1), reverse=True)

# print top 10 talkers (fewer if fewer pairs were seen)
for (src, dst), host_total in traffic_sorted[:10]:
    # get hostname from IP, falling back to the IP itself
    try:
        src_hostname = socket.gethostbyaddr(src)[0]
    except socket.error:
        src_hostname = src
    try:
        dst_hostname = socket.gethostbyaddr(dst)[0]
    except socket.error:
        dst_hostname = dst
    print("%s: %s (%s) -> %s (%s)" % (human(host_total), src_hostname, src, dst_hostname, dst))
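Separately from the rate question, the top-10 selection could probably be sketched with `collections.Counter`, whose `most_common(n)` returns at most `n` entries and so does not fail when fewer than ten pairs were seen. The `top_talkers` function, its `resolve` parameter, and the sample totals below are hypothetical names used only for illustration; `resolve` stands in for a reverse-DNS lookup so the sketch runs without network access.

```python
from collections import Counter


def top_talkers(totals, n=10, resolve=None):
    """Return up to n (bytes, src_name, dst_name) tuples, largest first."""
    resolve = resolve or (lambda ip: ip)  # default: leave addresses unresolved
    rows = []
    for (src, dst), total in Counter(totals).most_common(n):
        rows.append((total, resolve(src), resolve(dst)))
    return rows


# Hypothetical byte totals for three source->destination pairs.
sample = {
    ("10.0.0.1", "10.0.0.2"): 5000,
    ("10.0.0.3", "10.0.0.1"): 12000,
    ("10.0.0.2", "10.0.0.4"): 700,
}
top = top_talkers(sample, n=2)
print(top)
```

In the real script, `resolve` could wrap `socket.gethostbyaddr` and fall back to the IP address when the lookup fails.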
I'm not sure if this is a programming (scapy/python) question or more of a general networking question, so I'm calling it a network programming question.