In a python3/mininet script I have a tested, valid dictionary of host machines and their IP addresses.
For each of the keys (iterating over dictOfAllHostsAndIPs.keys()) I execute a script on each emulated host's terminal:
    for host in dictOfAllHostsAndIPs.keys():
        host.cmd(os.system( "python3 ./traffic_generator.py %s" % <my_args>))
This script sends random packets to IP addresses picked at random from the list of IPs on the emulated network. It generates the traffic by using echo to pipe some data to netcat, which sends it to a port that is normally not in use:
    os.system("echo -n '%s'" % data_string + " | nc %s 1299" % str(random_ip))
My issue shows up when I launch packet sniffers (mostly tshark) on every switch on the network. In the resulting logs, the first line of each TCP frame shows that the packets all come from my VM's IP (192.168.119.133) instead of the emulated hosts (10.0.0.XX), and the Ethernet source line always shows the MAC address of the VM itself instead of that of an emulated host.
# frame# /time(epoch) / source ip → dest ip / ports: src → dest
> 8 0.063202633 192.168.119.133 → 10.0.0.1 TCP 74 40370 → 1299
> Ethernet II, Src: VMware_34:8e:de (00:0c:29:34:8e:de), Dst:
> VMware_fa:1a:ac (00:50:56:fa:1a:ac)
> Destination: VMware_fa:1a:ac (00:50:56:fa:1a:ac)
> Address: VMware_fa:1a:ac (00:50:56:fa:1a:ac)
>
> Source: VMware_34:8e:de (00:0c:29:34:8e:de)
> Address: VMware_34:8e:de (00:0c:29:34:8e:de)
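To check many log lines at once rather than by eye, something like the following could flag frames whose source address lies outside the emulated network. This is only a minimal sketch: it assumes the one-line summary format shown in the log above and assumes the emulated hosts sit in 10.0.0.0/8 (Mininet's default range). `EMULATED_NET`, `SUMMARY_RE` and `source_is_emulated_host` are hypothetical names, not part of the scripts in this question.

```python
import ipaddress
import re

# Assumption: the emulated hosts use Mininet's default 10.0.0.0/8 range.
EMULATED_NET = ipaddress.ip_network("10.0.0.0/8")

# Matches "<src ip> → <dst ip>" (or "->") in a tshark one-line summary.
SUMMARY_RE = re.compile(r"(\d+\.\d+\.\d+\.\d+)\s*(?:→|->)\s*(\d+\.\d+\.\d+\.\d+)")


def source_is_emulated_host(summary_line):
    """Return True/False for the source IP, or None if no IP pair is found."""
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        return None
    return ipaddress.ip_address(match.group(1)) in EMULATED_NET


line = "8 0.063202633 192.168.119.133 → 10.0.0.1 TCP 74 40370 → 1299"
print(source_is_emulated_host(line))  # False: the source is the VM, not a 10.0.0.x host
```

Running this over every summary line of a capture would show at a glance whether any frame at all carries an emulated host as its source.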
The only thing that changes in the source section of the frames in my logs is the port from which the VM sends the packets, which is always between 35000 and 49999 (or maybe slightly above 50000); see the first line of the log above (ports: src → dest).
I initially assumed that each of the ports in use was dedicated to sending traffic from one particular emulated host, but there are far too many of them and they don't stay the same, so this is clearly not the case.
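The changing source ports are consistent with the kernel assigning a fresh ephemeral port to every new outgoing TCP connection (on Linux the default range is 32768 to 60999, set by `net.ipv4.ip_local_port_range`). A minimal sketch of that behaviour, independent of Mininet and using only the standard library; the helper name `open_two_connections` is hypothetical:

```python
import socket


def open_two_connections():
    """Open two TCP connections to one local listener and return their source ports."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel choose a free port
    listener.listen(2)
    target = listener.getsockname()

    # Both connections stay open at the same time, so the kernel must give
    # them different source ports to keep the two connections distinguishable.
    clients = [socket.create_connection(target) for _ in range(2)]
    ports = [client.getsockname()[1] for client in clients]

    for client in clients:
        client.close()
    listener.close()
    return ports


print(open_two_connections())  # two different kernel-assigned source ports
```

Since every `nc` invocation opens its own connection, the source port says nothing about which emulated host a packet was meant to come from.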
I did not realize this right away, because everything else looked (and still looks) okay: packet destinations and every other parameter I need to check for my research project. In the meantime I implemented a modular, threaded refactoring of my code. The logs of the different versions of my program all show the same issue, whether I run the traffic generation script directly (older versions) or instantiate trafficGenerator objects (newer versions), so the issue does not depend on this aspect of my scripts' structure.
I have checked and adjusted my calls many times, and quadruple-checked the lists (then dicts, then lists again) that I iterate over so that each individual machine runs the traffic generation code, using for host in list(net.get(hosts)): host.cmd(<action>)
The latest version goes like this:
    listOfTGs = []
    ### Create a list of threads for parallel execution
    listOfThreads = []
    for host in dictOfAllHostsAndIPs.keys():
        ### Create a trafficGenerator object on the mininet machine, then call commands on host
        listOfTGs.append(tg.TrafficGenerator(durationFromArgument, host.name, callable_ips_list, net))
    for tg_instance in listOfTGs:
        listOfThreads.append(tg_instance.threadedExecution())
    print(listOfThreads)
    for th in listOfThreads:
        th.start()
I can confirm (after many hours of careful print-based debugging) that traffic generation, like all other actions, is executed once for each emulated host, in parallel, with the correct identity for each host visible in stdout. Still, the IP I get when sniffing the packets is that of the VM itself instead of that of the emulated host that should appear. Papers from other researchers show the correct host addresses in their sniffing analysis, so this is not just "something that mininet does this way" (I looked into it to be sure).
TL;DR: the source of the packets being sent around is the VM on which I launch the mininet emulation, not the emulated hosts in whose name I start the traffic generation processes.