I am creating a ProcessPoolExecutor and then submitting some tasks to it. ProcessPoolExecutor forks multiple worker processes to run the tasks. The issue I am facing is that some of the forked processes are still stuck on their tasks even after the ProcessPoolExecutor is shut down, so when I list processes from the command prompt I can see multiple processes with the same COMMAND.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

executor = ProcessPoolExecutor()
futures = [executor.submit(self.create_models, col) for col in self.df.columns]
for future in as_completed(futures):
    try:
        model, col = future.result()
        models[f'm_{col}'] = model
    except Exception:
        logger.exception('Unable to get the results')
        raise
executor.shutdown(wait=True)
```
– Programmer
  • You have an indentation error and you have not posted a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). – Booboo Sep 25 '22 at 12:45
  • A MCVE would help, but I'd suggest using the executor as a context manager to perform the shutdown cleanly. I'd also use `.map` to submit and collect results in one step. It also might be worth profiling to see where time is spent; based on your identifiers, it's likely that a significant amount of time will be spent moving data to/from processes via `pickle`. – Sam Mason Sep 26 '22 at 16:17
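
A minimal sketch of the context-manager and `.map` approach the second comment suggests. Here `create_models` and `build_models` are hypothetical stand-ins for the question's method, assuming the task takes a column name and returns a `(model, col)` pair:

```python
from concurrent.futures import ProcessPoolExecutor

def create_models(col):
    # Hypothetical stand-in for self.create_models: fit a model for one
    # column and return it together with the column name.
    model = f'model_for_{col}'  # placeholder for a real fitted model
    return model, col

def build_models(columns):
    models = {}
    # The with-block calls executor.shutdown(wait=True) on exit, so no
    # worker process should be left running once the block finishes.
    with ProcessPoolExecutor() as executor:
        # map submits one task per column and yields results in
        # submission order, replacing submit/as_completed in one step.
        for model, col in executor.map(create_models, columns):
            models[f'm_{col}'] = model
    return models

if __name__ == '__main__':
    print(build_models(['a', 'b', 'c']))
```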

0 Answers