I'm writing a Python script that takes a large set of IPs (~20k) and resolves the hostname of each one using socket.gethostbyaddr. To make the script run as fast as possible, I implemented multithreading. Note that the computer running the script is on a wired Ethernet connection.

The issue is that while the script is running, my Wi-Fi connection drops (on my phone and on every other household device). The script itself continues to run fine, and once it completes, Wi-Fi is restored.

I've experimented with the number of threads active at once, and this is likely the cause. If I set the limit to, say, 500 threads, the script completes in 2 minutes, but the Wi-Fi drops. If I lower it to fewer than 50 threads, the Wi-Fi is fine, but the script takes ages to complete. As the limit goes higher, the Wi-Fi connection degrades until it drops entirely.

So it seems I am essentially DDoSing myself. Is there a way to complete the script efficiently without bottlenecking my Wi-Fi? Or is there something wrong with my implementation? My code is below:

import socket
from concurrent import futures

def lookup_IP(ip):
    # Reverse-resolve one IP; always returns an (ip, hostname-or-message) tuple.
    result = tuple()
    try:
        host, aliases, _ = socket.gethostbyaddr(ip)
        result = (ip, host)
    except socket.herror:
        result = (ip, 'no host found')
    except Exception:
        result = (ip, 'error finding host')
    return result

# all_IPs is the set of ~20k IP strings built earlier in the script
with futures.ThreadPoolExecutor(500) as executor:
    submitted_threads = {executor.submit(lookup_IP, ip): ip for ip in all_IPs}
    for thread in futures.as_completed(submitted_threads):
        try:
            data = thread.result()
            # add data to a dictionary
        except Exception as err:
            print(err)  # print error
# rest of code
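
To make the trade-off described above easier to experiment with, here is a minimal sketch that exposes both the thread limit and a submission rate as parameters. This is not a confirmed fix for the Wi-Fi drops. The names resolve_all, max_per_second and resolver are hypothetical, and the default values (50 workers, 200 submissions per second) are arbitrary starting points rather than measured numbers. The resolver parameter exists only so the function can be tried without touching the network.

```python
import socket
import time
from concurrent import futures


def lookup_ip(ip, resolver=socket.gethostbyaddr):
    """Return (ip, hostname), or (ip, placeholder message) when the lookup fails."""
    try:
        host, _aliases, _addrs = resolver(ip)
        return ip, host
    except socket.herror:
        return ip, 'no host found'
    except Exception:
        return ip, 'error finding host'


def resolve_all(ips, max_workers=50, max_per_second=200,
                resolver=socket.gethostbyaddr):
    """Resolve every IP, submitting at most max_per_second lookups per second.

    Hypothetical helper: the defaults are guesses to be tuned, not measured values.
    """
    interval = 1.0 / max_per_second
    results = {}
    with futures.ThreadPoolExecutor(max_workers) as executor:
        pending = []
        for ip in ips:
            pending.append(executor.submit(lookup_ip, ip, resolver))
            time.sleep(interval)  # pace submissions instead of queuing all at once
        for future in futures.as_completed(pending):
            ip, host = future.result()
            results[ip] = host
    return results
```

Calling resolve_all(all_IPs, max_workers=100, max_per_second=300) and then raising or lowering the two numbers would show where the network starts to suffer. Pacing the submissions caps the average query rate, while max_workers caps how many lookups can be in flight at once.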