I have a list of 10k IP addresses and I need to get their FQDNs. Doing this synchronously takes ages, so I tried coding it asynchronously, but I don't see any difference in execution time.

Synchronous method:

import socket
import time

def test_synch():
    start_time = time.time()
    for ip in ip_list:  # ip_list holds the IP addresses to resolve
        fqdn = socket.getfqdn(ip)
        print(fqdn)
    print("Time for synchronous requests: ", time.time()-start_time)

Execution time: 284 seconds for 100 IP addresses

Asynchronous method:

import asyncio

async def get_fqdn_async(ip):
    return socket.getfqdn(ip)


async def get_fqdn(ip):
    print("executed task for ip", ip)
    fqdn = await get_fqdn_async(ip)
    print("got fqdn ", fqdn, " for ip ", ip)
    return fqdn


async def main():
    tasks = []
    for ip in ip_list:
        task = asyncio.create_task(
            get_fqdn(ip))
        tasks.append(task)

    fqdns = await asyncio.gather(*tasks)
    print(fqdns)

def test_asynch():
    start_time = time.time()
    asyncio.run(main())
    print("Time for asynchronous requests: ", time.time()-start_time)

Execution time: 283 seconds for 100 IP addresses

Obviously I am doing something wrong, but I can't figure out what.

Sederfo

1 Answer

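It seems to me that multithreading would be ideal here. socket.getfqdn() is a blocking call, so wrapping it in a coroutine still runs the lookups one after another. Consider this: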

from concurrent.futures import ThreadPoolExecutor
import socket
import json

list_of_ips = ['www.google.com', 'www.bbc.co.uk', 'www.tripadvisor.com', 'www.stackoverflow.com', 'www.facebook.com']

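def getfqdn(ip):
    # Return the input alongside the result so each answer can be matched to its IP
    return ip, socket.getfqdn(ip)

results = dict()
with ThreadPoolExecutor() as executor:
    # Submit every unique IP up front, then collect the results in submission order
    for future in [executor.submit(getfqdn, ip) for ip in set(list_of_ips)]:
        ip, fqdn = future.result()
        results[ip] = fqdn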

with open('report.json', 'w') as j:
    json.dump(results, j, indent=4)
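
If you would rather stay with asyncio, the blocking call can be handed to a worker thread instead of being called directly inside the coroutine. A minimal sketch, assuming Python 3.9+ for asyncio.to_thread; resolve_all and its resolver parameter are names made up for this illustration:

```python
import asyncio
import socket


async def resolve_all(ips, resolver=socket.getfqdn):
    """Resolve every unique entry of ips without blocking the event loop.

    resolver is a hypothetical hook (defaulting to socket.getfqdn) so the
    function can be tried out without touching the network.
    """

    async def resolve_one(ip):
        # asyncio.to_thread runs the blocking call in the loop's default
        # thread pool and lets other tasks proceed while it waits.
        return ip, await asyncio.to_thread(resolver, ip)

    pairs = await asyncio.gather(*(resolve_one(ip) for ip in set(ips)))
    return dict(pairs)


# Usage (performs real DNS lookups):
#     fqdns = asyncio.run(resolve_all(ip_list))
```

The lookups still run in threads, so this is the same idea as the ThreadPoolExecutor version, just driven from an event loop.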
DarkKnight
  • How would I access the ip variable inside the for loop? # get fqdns for source ips with ThreadPoolExecutor() as executor: for future in [executor.submit(socket.getfqdn, ip) for ip in source_ips if ip not in ip_to_fqdn_mapping]: fqdn = future.result() ip_to_fqdn_mapping[ip] = fqdn # this gives syntax error, ip is not defined print(fqdn) – Sederfo Dec 17 '21 at 17:46
  • You seem to have changed your requirement - ip_to_fqdn_mapping wasn't in the original question. Perhaps consider asking a new question. Also, there's something not quite right on your platform. You say you can run my code for 100 IPs in 50 seconds yet I can run 10,000 in just over 6 seconds. Are you on dial-up? – DarkKnight Dec 17 '21 at 17:51
  • I am on VPN. I will try this code on a company's server. I am using ip_to_fqdn_mapping to build a json containing a map between ip address and fqdn so I don't make requests twice. – Sederfo Dec 17 '21 at 17:55
  • You don't need a dictionary for that. I've made a small change to the code which will suffice – DarkKnight Dec 17 '21 at 18:00
  • Thank you! And yes, on the servers, it resolved 56k ip addresses in 26 seconds! – Sederfo Dec 17 '21 at 18:03
  • I am still planning to use that JSON dictionary, which I dump to a file at the end of the script and load each time the script runs, since it will need to run a few times a day to generate some reports. That is why I wanted to use the dictionary: to store IP addresses I have already resolved. – Sederfo Dec 17 '21 at 18:07
  • OK - I've made a further edit which should help you – DarkKnight Dec 17 '21 at 18:15
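
The comments above raise two points: keeping track of which IP a future belongs to, and skipping IPs that were already resolved in an earlier run. A rough sketch of one way to do both, not necessarily what the edited answer does; load_cache, resolve_new, save_cache, the resolver parameter and the max_workers value are all assumptions made for this illustration:

```python
import json
import socket
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def load_cache(path):
    """Return the saved ip -> fqdn mapping, or an empty dict on the first run."""
    path = Path(path)
    if path.exists():
        with path.open() as f:
            return json.load(f)
    return {}


def resolve_new(ips, cache, resolver=socket.getfqdn, max_workers=32):
    """Resolve only the IPs missing from cache and add them to it in place."""
    pending = {ip for ip in ips if ip not in cache}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Key each future by its IP so the IP is known when the result arrives.
        future_to_ip = {executor.submit(resolver, ip): ip for ip in pending}
        for future in as_completed(future_to_ip):
            cache[future_to_ip[future]] = future.result()
    return cache


def save_cache(path, cache):
    """Write the mapping back so the next run can reuse it."""
    with open(path, "w") as f:
        json.dump(cache, f, indent=4)


# Usage (performs real DNS lookups):
#     cache = load_cache("report.json")
#     resolve_new(source_ips, cache)
#     save_cache("report.json", cache)
```

Mapping futures to IPs in a dictionary avoids relying on the comprehension variable, which is not visible outside the list comprehension.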