
The __exit__ method of my custom context manager seemingly runs before the computation is done. The context manager is meant to simplify writing concurrent/parallel code. Here is my context manager code:

import time
from multiprocessing.dummy import Pool, cpu_count

class managed_pool:
    '''Simple context manager for multiprocessing.dummy.Pool'''
    def __init__(self, msg):
        self.msg = msg
    def __enter__(self):
        cores = cpu_count()
        print 'start concurrent ({0} cores): {1}'.format(cores, self.msg)
        self.start = time.time()
        self.pool = Pool(cores)
        return self.pool
    def __exit__(self, type_, value, traceback):
        print 'end concurrent:', self.msg
        print 'time:', time.time() - self.start
        self.pool.close()
        self.pool.join()

I've already tried this script with multiprocessing.Pool instead of multiprocessing.dummy.Pool and it seems to fail all the time.

Here is an example of using the context manager:

def read_engine_files(f):
    engine_input = engineInput()
    with open(f, 'rb') as f:
        engine_input.parse_from_string(f.read())
    return engine_input

with managed_pool('load input files') as pool:
    data = pool.map(read_engine_files, files)

So, inside read_engine_files I print the name of each file. You'll notice in the __exit__ function that I also print when the computation is done and how long it took. But in stdout the __exit__ message appears way before the computation has finished, minutes before, in fact. Meanwhile htop says all of my cores are still being used. Here's an example of the output:

start concurrent (4 cores): load engine input files
file1.pbin
file2.pbin
...
file16.pbin
end concurrent: load engine input files
time: 246.43829298
file17.pbin
...
file45.pbin

Why is __exit__ being called so early?

jamis
    Where is the code for `read_engine_files`? – Tom Dalton Apr 05 '17 at 23:25
  • @TomDalton I added a function for `read_engine_files`. Can someone explain why this is being downvoted? – jamis Apr 06 '17 at 16:50
  • It's probably being downvoted because you haven't provided a minimal example. Even with the extra code you've posted, now we can't see the `engineInput()` or `engine_input.parse_from_string(f.read())` code. – Tom Dalton Apr 07 '17 at 07:22

2 Answers


Are you sure you're just calling pool.map()? That should block until all the items have been mapped.

If you're calling one of the asynchronous methods of Pool, then you should be able to solve the problem by changing the order of things in __exit__(). Just join the pool before doing the summary.

def __exit__(self, type_, value, traceback):
    # Wait for the workers to finish before printing the summary.
    self.pool.close()
    self.pool.join()
    print 'end concurrent:', self.msg
    print 'time:', time.time() - self.start
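
For comparison, here is a minimal sketch of the blocking versus asynchronous behavior, using a toy slow_double function in place of read_engine_files (the function and sleep times are illustrative only): pool.map() does not return until every item has been processed, while pool.map_async() returns immediately and only blocks when you call .get() on its result.

import time
from multiprocessing.dummy import Pool

def slow_double(x):
    time.sleep(1)
    return 2 * x

pool = Pool(4)

# Blocking: map() only returns after every item has been processed.
print 'map results:', pool.map(slow_double, range(8))

# Non-blocking: map_async() hands back an AsyncResult immediately,
# so code after it (such as the prints in __exit__) runs while the
# workers are still busy.
async_result = pool.map_async(slow_double, range(8))
print 'map_async returned immediately'
print 'map_async results:', async_result.get()  # this call is what blocks

pool.close()
pool.join()

If the early "end concurrent" message disappears after moving the join to the top of __exit__, something in the pipeline was effectively asynchronous; if it really is a plain pool.map() call, the cause lies elsewhere.
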
Don Kirkby

The most likely explanation is that an exception occurred. The code sample above does not inspect the type, value, or traceback arguments of the __exit__ method. So if an exception occurs (and is not caught earlier), it is handed to __exit__, which in turn does not react to it, while the pool's workers (or some of them) continue running.
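
A minimal sketch of adding such a check to the original __exit__ (the extra print is only illustrative, not part of the asker's code):

def __exit__(self, type_, value, traceback):
    if type_ is not None:
        # An exception escaped the with-block; make the early exit
        # visible instead of letting it look like a normal finish.
        print 'aborted by exception:', type_.__name__, value
    self.pool.close()
    self.pool.join()
    print 'end concurrent:', self.msg
    print 'time:', time.time() - self.start

Since __exit__ returns None (a false value), the exception still propagates after the pool is joined, so the failure remains visible to the caller.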

jws