
I'm trying to perform about 100k GET requests and parse the response body of each request. I thought grequests would be a good way to go, but I'm getting errors related to 'too many open files'. Here's the code:

import grequests

with open("./100k-sites.csv", "r") as f:
    urls = ["http://" + line.rstrip() for line in f]

rs = (grequests.get(u, timeout=1) for u in urls)
responses = grequests.map(rs)

for r in responses:
    try:
        pass  # do something with the response body
    except Exception:
        pass

Anyone got experience with this? The error I'm getting is:

<requests.packages.urllib3.connection.HTTPConnection object at 0x7f817ab36898>: Failed to establish a new connection: [Errno 24] Too many open files

  • There's a rather [long discussion](https://github.com/requests/requests/issues/239) on Github, though without a real fix as far as I can see. – z80crew Apr 12 '18 at 16:27

2 Answers


Maybe it's only a workaround (as somebody in the discussion linked above says), but IMHO it's worth noting here that you can fix it with these two lines:

import resource
resource.setrlimit(resource.RLIMIT_NOFILE, (110000, 110000))
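Note that setrlimit raises ValueError if you ask for more than the process's hard limit, which an unprivileged process cannot exceed. A more defensive sketch (assuming a Unix-like system, where the resource module is available) clamps the request to the hard limit first:

import resource

# Read the current soft/hard limits on open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Never ask for more than the hard limit; only a privileged
# process can raise the hard limit itself.
if hard == resource.RLIM_INFINITY:
    target = 110000
else:
    target = min(110000, hard)

resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

new_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
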
xhancar

Use imap instead of map

for resp in grequests.imap(rs, size=20):
    pass

and then you will not have such problems with processes and memory. But keep in mind that imap returns a generator, yielding responses as they complete instead of building the whole list.
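A minimal self-contained sketch of this approach, with a hypothetical failure handler added (the .invalid URLs are deliberately unresolvable, just to exercise the handler; the size parameter caps how many requests, and therefore open sockets, are in flight at once):

import grequests

# Hypothetical input: .invalid domains never resolve, so every
# request fails fast and goes through the exception handler.
urls = ["http://nonexistent.invalid", "http://also-nonexistent.invalid"]

failed = []

def on_error(request, exception):
    # Called once per request that raises (DNS errors, timeouts, ...).
    failed.append(request.url)

rs = (grequests.get(u, timeout=1) for u in urls)

# size=20 means at most ~20 sockets are open at any moment, so the
# RLIMIT_NOFILE ceiling is never approached even for 100k URLs.
for resp in grequests.imap(rs, size=20, exception_handler=on_error):
    resp.close()  # release the connection promptly
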

Exord