
I'm trying to download data using the HTTP request below. This request is made sequentially a few thousand times.

import urllib.request

# Stream the response body to disk in 1 MiB chunks.
with urllib.request.urlopen(url, timeout=120) as resp:
    with open(save_loc + '.part', 'wb') as fh:
        while True:
            chunk = resp.read(1024 * 1024)
            if not chunk:
                break
            fh.write(chunk)
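(For reference, a minimal self-contained sketch of how this snippet sits inside downloadData is below; BASE_URL, SAVE_DIR, and the final rename of the .part file are placeholder assumptions, not my exact code.)

import os
import urllib.request

BASE_URL = 'https://example.com/data/'  # placeholder endpoint, not my real one
SAVE_DIR = '/tmp/downloads'             # placeholder directory, not my real one

def downloadData(item_id):
    # Hypothetical wrapper around the snippet above.
    url = BASE_URL + item_id
    save_loc = os.path.join(SAVE_DIR, item_id)
    with urllib.request.urlopen(url, timeout=120) as resp:
        with open(save_loc + '.part', 'wb') as fh:
            while True:
                chunk = resp.read(1024 * 1024)
                if not chunk:
                    break
                fh.write(chunk)
    # Rename only once the download has completed, so partial files stay marked.
    os.replace(save_loc + '.part', save_loc)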

This is being called by:

from multiprocessing import Pool

if __name__ == '__main__':
    x = [str(x) for x in range(1, 100)]
    with Pool(initializer=init_worker, processes=1) as pool:
        result = pool.map(downloadData, x, chunksize=1)
        pool.close()
        pool.join()

The download functionality will be scaled up in the future, which is why I have structured the code around multiprocessing.

Workarounds I've come across for such errors are either to raise the open-files limit with

ulimit -n [limit]

or to open the file using a "with" statement.
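As a sanity check, the same limit can also be inspected (and, up to the hard limit, raised) from inside Python on Unix via the standard resource module; a minimal sketch:

import resource

# Query the current soft/hard limits on open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f'soft={soft}, hard={hard}')

# Raise the soft limit to the hard limit (the same effect as ulimit -n).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))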

I'm trying to understand why there are still files open (as the error suggests) when I am using the "with" statement, which automatically closes the file handle.
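To confirm whether descriptors are really accumulating, I can count them inside the worker between downloads; a Linux-specific sketch (the /proc path is an assumption about the platform):

import os

def open_fd_count():
    # Every entry in /proc/self/fd is one descriptor currently open in
    # this process (including the one used to list the directory itself).
    return len(os.listdir('/proc/self/fd'))

# Called between downloads, a steadily growing count would confirm a leak.
print('open fds:', open_fd_count())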

I don't see any obvious source of file handle leaks here. But I also don't see an [MCVE]; I see isolated bits of code that may, or may not, contain the problem (is the first bit of code `downloadData`? If so, what is `init_worker`? Is there any other relevant code omitted?). We need a complete [MCVE] to give useful advice. – ShadowRanger Jul 23 '20 at 04:54

0 Answers