The Python documentation has examples in this form:

with Pool() as p:
    p.map(do, items)

but I see a lot of people using the form below instead:

p = Pool()
p.map(do, items)
p.close()
p.join()

Which is more desirable?

Nikolas Stevenson-Molnar
Seung

1 Answer

I think using Pool as a context manager (i.e., with Pool() as p: ...) is preferable. Context manager support is a newer addition to Pool (Python 3.3), and it lets you scope the lifespan of the pool more cleanly.

One thing to be aware of is that when the context manager exits, it calls terminate() on the pool, which stops the workers immediately, including any tasks still in progress. This means that you still want to call p.join() in some cases. Your example doesn't require this, because p.map blocks until all tasks are done anyway:

A parallel equivalent of the map() built-in function (it supports only one iterable argument though). It blocks until the result is ready.

https://docs.python.org/3.7/library/multiprocessing.html#multiprocessing.pool.Pool.map

Therefore, in the second example, the call to .join() is unnecessary, as .map() will block until all tasks have completed.
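As a minimal sketch of the blocking case: because .map() only returns once every result is ready, the results are already in hand by the time the with block exits and terminates the pool. The names `square` and `run_with_context_manager` are hypothetical, and `processes=2` is an arbitrary choice for illustration.

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical stand-in for the real work function.
    return x * x


def run_with_context_manager(values):
    # map() blocks until all results are ready, so nothing is still
    # running when the with block exits and terminates the pool.
    with Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(run_with_context_manager(range(5)))
```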

However, using .map_async would make .join useful:

with Pool() as p:
    p.map_async(do_something, range(100))
    # Do something else while tasks are running
    p.close()
    p.join()
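If you also need the results, one possible variation is to keep the AsyncResult object that .map_async() returns and call .get() on it once the workers have finished. This is only a sketch: `square` and `run_async` are hypothetical names, and `processes=2` is arbitrary.

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical stand-in for the real work function.
    return x * x


def run_async(values):
    with Pool(processes=2) as pool:
        # map_async() returns immediately with an AsyncResult.
        async_result = pool.map_async(square, values)
        # ... do something else while the tasks are running ...
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the workers to finish and exit
        # The tasks are complete, so get() returns without waiting.
        return async_result.get()


if __name__ == "__main__":
    print(run_async(range(5)))
```

Calling .get() on its own also blocks until the results are ready, so close()/join() matters most when the results themselves are not collected.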

Edit: as Facundo Olano points out, .close() must always be called before .join(), as stated in the docs:

Wait for the worker processes to exit. One must call close() or terminate() before using join().

https://docs.python.org/3.7/library/multiprocessing.html#multiprocessing.pool.Pool.join
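A small sketch of what happens if the order is wrong. The exact exception type depends on the Python version (as far as I know, recent CPython versions raise ValueError, while older ones used an assertion), so this catches both. The function name `join_before_close` is hypothetical.

```python
from multiprocessing import Pool


def join_before_close():
    pool = Pool(processes=1)
    try:
        # Joining a pool that is still accepting work is an error.
        pool.join()
    except (ValueError, AssertionError) as exc:
        # Exception type assumed to vary by Python version.
        return type(exc).__name__
    finally:
        # Correct order: close() first, then join().
        pool.close()
        pool.join()
    return None


if __name__ == "__main__":
    print(join_before_close())
```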

Nikolas Stevenson-Molnar
  • I think `p.close()` is also necessary in the async example, the doc says "One must call close() or terminate() before using join()", and I can confirm there's an error if I attempt joining before closing in python 3.6 – Facundo Olano Sep 17 '20 at 14:20
  • @FacundoOlano, thanks for catching this! I've fixed my async example above. – Nikolas Stevenson-Molnar Sep 17 '20 at 16:51
  • Dear all, one thing that kept me wondering was whether one should call "close" and "join" within the Pool context manager, or outside. Would someone elucidate it further? Sincerely – Philipe Riskalla Leal Jun 24 '21 at 22:31
  • @PhilipeRiskallaLeal it has to be within the context. p will no longer exist after the code exits "with" context – Thang Do Oct 05 '21 at 03:45
  • @ThangDo This isn't actually the case. Try it out in the interpreter: `p` is still in scope after the context manager block. – Nikolas Stevenson-Molnar Oct 05 '21 at 04:36
  • @ThangDo Though you are correct that `.join()` should be called inside the context, because when the context exits, it stops all tasks even if they're not completed. – Nikolas Stevenson-Molnar Oct 05 '21 at 04:39
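To illustrate the last two comments, here is a minimal sketch showing that the name bound by the with statement is still in scope after the block, but the pool it refers to has been terminated and rejects new work. The names `square` and `use_pool_after_exit` are hypothetical.

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical stand-in for the real work function.
    return x * x


def use_pool_after_exit():
    with Pool(processes=2) as pool:
        inside = pool.map(square, [1, 2, 3])

    # `pool` is still a bound name here, but the pool was terminated
    # when the with block exited, so submitting work fails.
    try:
        pool.map(square, [4, 5, 6])
    except ValueError as exc:
        after = str(exc)
    else:
        after = None
    return inside, after


if __name__ == "__main__":
    print(use_pool_after_exit())
```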