I'm trying to look up about 10,000 domains via whois with the following code.

import asyncio

async def lookup(server, port, query, sema):
    # The semaphore caps how many lookups run concurrently.
    async with sema:
        try:
            reader, writer = await asyncio.open_connection(server, port)
        except:
            return {}
        writer.write(query.encode("ISO-8859-1"))
        await writer.drain()
        data = b""
        while True:
            d = await reader.read(4096)
            if not d:
                break
            data += d
        writer.close()
        data = data.decode("ISO-8859-1")
        return data

However, I repeatedly get the error 'Connect Failed'. A single lookup goes through, which means the whois server is up. I've also increased the ulimit to 10,000, but I'm limiting lookups to only 1,000 at a time with a semaphore.

Jonathan
  • Please don't use a blanket exception handler. You are playing Pokemon, but you really don't want to catch them all. Catch **specific exceptions only**, because you don't want to ignore `GeneratorExit`, `MemoryError`, and `KeyboardInterrupt` and pretend they are connection errors. – Martijn Pieters Apr 27 '19 at 11:00
  • Have you tried lowering the number of connections to that whois server, that is, setting a lower limit on the semaphore? It could well be that the whois server admins don't like 1000 simultaneous connections from a single IP address and are rate-limiting you. – Martijn Pieters Apr 27 '19 at 11:01

1 Answer

The whois server is almost certainly rate limiting you. Not all whois servers are built to scale to 1000s of concurrent connections from a single IP address.

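Limit your rate further: either lower the semaphore limit or switch to a leaky bucket rate limiter. A semaphore only caps how many connections are open at the same time; it does not control how quickly new connections are started.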
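
As a rough illustration of the leaky bucket idea, here is a minimal sketch of a pacer that lets connection attempts through at a fixed rate with no burst allowance. The `LeakyBucket` class, the `fake_lookup` stand-in, and the rate of 50 per second are all assumptions made up for this example; they are not from any library, and the rate a given whois server tolerates is something you would have to find out by experiment.

```python
import asyncio
import time


class LeakyBucket:
    """Hypothetical helper: lets callers through at most `rate` times per second."""

    def __init__(self, rate):
        self._interval = 1.0 / rate  # seconds between two permitted requests
        self._next_slot = 0.0        # monotonic time of the next free slot

    async def acquire(self):
        now = time.monotonic()
        # Reserve the next free slot. There is no await between reading and
        # updating _next_slot, so two tasks cannot reserve the same slot.
        slot = max(now, self._next_slot)
        self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            await asyncio.sleep(delay)


async def fake_lookup(limiter, domain):
    # Stand-in for the real whois lookup: wait for a slot, then "connect".
    await limiter.acquire()
    return domain


async def main():
    limiter = LeakyBucket(rate=50)  # assumed rate, tune for the real server
    start = time.monotonic()
    domains = [f"example{i}.com" for i in range(11)]
    results = await asyncio.gather(*(fake_lookup(limiter, d) for d in domains))
    return results, time.monotonic() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(f"{len(results)} lookups in {elapsed:.2f}s")
```

In the real code you would call `await limiter.acquire()` just before `asyncio.open_connection()`. You can keep the semaphore as well if you also want an upper bound on the number of sockets open at once.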

Alternatively, find a whois API provider that offers higher query rate options, or better yet, supports bulk queries.

Martijn Pieters