
Here is my program, which uses multiprocessing to process some images:

import multiprocessing as mp
import tqdm

def image_process(path):
    # ...
    return

def main():
    images = []  # some images' dir paths
    result = []

    pool = mp.Pool(mp.cpu_count())
    for temp in tqdm.tqdm(pool.imap(image_process, images), total=len(images)):
        result.append(temp)

if __name__ == "__main__":
    main()

It works very well when I test it in PyCharm, with no crash. But once I build it with PyInstaller and run the resulting .exe file, it drains all 32 GB of my RAM in a short time and then Windows crashes. The tqdm bar stays at 0 throughout the run. It crashes so fast that I have no time to do any debugging. I only found that the number of "Processes" on the "CPU" page of Windows Task Manager increases rapidly. It seems the program just keeps creating new processes without removing the outdated ones.

Here is what I tried:

pool.close()
pool.join()

I read some questions about this problem on Stack Overflow and tried adding these two lines in the 'for' loop after append(), but the problem still exists.
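As a side note, `close()` and `join()` belong after the loop, not inside it; calling them inside the loop shuts the pool down after the first result. A minimal sketch of the correct placement (using a hypothetical `square` worker in place of `image_process`; this alone does not fix the frozen-executable issue described above):

```python
import multiprocessing as mp

def square(x):
    # hypothetical stand-in for the real image_process worker
    return x * x

def main():
    items = [1, 2, 3, 4]
    results = []
    pool = mp.Pool(mp.cpu_count())
    for value in pool.imap(square, items):
        results.append(value)
    pool.close()  # after the loop: no more tasks will be submitted
    pool.join()   # wait for all worker processes to exit
    return results

if __name__ == "__main__":
    print(main())  # → [1, 4, 9, 16]
```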

Hui Gordon
  • For usage with PyInstaller or anything else that "freezes" a script into a Windows executable, you should call `multiprocessing.freeze_support()` at the beginning of `if __name__ == "__main__":`. You should also specify `maxtasksperchild` in the `Pool` constructor to handle any resources not explicitly cleaned up, by periodically re-starting each worker process. – Aaron Apr 13 '21 at 17:30
  • After adding `multiprocessing.freeze_support()`, it works well. Thank you for your quick and clear answer. By the way, if I set `maxtasksperchild=1`, does that mean each worker process will run the function _image_process()_ just once, then exit and be replaced by a new worker process? I am not so sure about the definition of _task_ in `maxtasksperchild`. – Hui Gordon Apr 14 '21 at 05:22
  • You are correct in the way `maxtasksperchild` works, and using a value of `1` would be a little silly. Depending on how long each task takes, and how long it takes to spin up a new child (which would take some profiling), I would personally try to adjust the max-tasks so the re-start overhead is in the neighborhood of 10% or less. – Aaron Apr 14 '21 at 20:45
  • I see, got it! I will take it as reference for my program. – Hui Gordon Apr 15 '21 at 08:45

0 Answers