My question is very similar to this question, except that the catching solution suggested there didn't quite work for me.

Problem: I'm using multiprocessing to handle a file in parallel. It works about 97% of the time. Sometimes, however, the parent process idles forever and its CPU usage shows 0.

Here is a simplified version of my code:

from PIL import Image
import imageio
from multiprocessing import Process, Manager

def split_ranges(min_n, max_n, chunks=4):
    chunksize = ((max_n - min_n) // chunks) + 1  # integer division so range() gets an int step
    return [range(x, min(max_n-1, x+chunksize)) for x in range(min_n, max_n, chunksize)]

def handle_file(file_list, vid, main_array):
    for index in file_list:
        try:
            #Do Stuff
            valid_frame = Image.fromarray(vid.get_data(index))
            main_array[index] = 1
        except:
            main_array[index] = 0

def main(file_path):
    mp_manager = Manager()
    vid = imageio.get_reader(file_path, 'ffmpeg')
    num_frames = vid._meta['nframes'] - 1

    list_collector = mp_manager.list(range(num_frames)) #initialize a list as the size of number of frames in the video

    total_list = split_ranges(10, min(200, num_frames), 4) #some arbitrary numbers between 0 and num_frames of video

    processes = []
    file_readers = []

    for split_list in total_list:
        video = imageio.get_reader(file_path, 'ffmpeg')
        proc = Process(target=handle_file, args=(split_list, video, list_collector))
        print("Started Process") #Always gets printed
        proc.daemon = False
        proc.start()
        processes.append(proc)
        file_readers.append(video)

    for i, proc in enumerate(processes):
        proc.join()
        print("Join Process " + str(i)) #Doesn't get printed
        fd = file_readers[i]
        fd.close()

    return list_collector

The issue is that I can see the processes starting and I can see that all of the items are being handled. Sometimes, however, the processes never join. When I check back, only the parent process is there, and it's idling as if it's waiting for something. None of the child processes are there, yet the first `proc.join()` never seems to return, because the print statement after it doesn't show up.
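
To narrow down which child is holding things up, one option is to join with a timeout rather than blocking indefinitely. The sketch below is a minimal illustration, not a fix: `join_with_timeout` is a hypothetical helper name, and it relies only on the standard `Process.join(timeout)` and `Process.is_alive()` methods.

```python
import time


def join_with_timeout(processes, timeout):
    """Wait at most `timeout` seconds in total for all processes.

    Returns the indexes of the processes that are still alive afterwards.
    `processes` can be any objects exposing join(timeout) and is_alive(),
    such as multiprocessing.Process instances.
    """
    deadline = time.time() + timeout
    stuck = []
    for i, proc in enumerate(processes):
        # Give each join only the time left in the shared budget.
        remaining = max(0.0, deadline - time.time())
        proc.join(remaining)
        if proc.is_alive():
            stuck.append(i)
    return stuck
```

In `main`, this could stand in for the plain `proc.join()` loop, for example `stuck = join_with_timeout(processes, 60)` (the 60 seconds is an arbitrary choice). Any index it returns points at a child that has not exited, which can then be inspected or terminated.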

My hypothesis is that this happens to videos with a lot of broken frames. However, it's a bit hard to reproduce this error because it rarely occurs.
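
One way to test this hypothesis is to record what goes wrong on each frame. A bare `except:` hides the error and also catches `KeyboardInterrupt` and `SystemExit`. The sketch below is only an assumption about how this could look: `process_frames`, `load_frame`, and the `errors` mapping are hypothetical names, with `load_frame` standing in for the per-frame work and `errors` being something like a `Manager().dict()`.

```python
import traceback


def process_frames(indexes, load_frame, results, errors):
    """Call load_frame(index) for each index and record every failure.

    results[index] is set to 1 on success and 0 on failure;
    errors[index] holds the formatted traceback of a failed frame.
    """
    for index in indexes:
        try:
            load_frame(index)
            results[index] = 1
        except Exception:
            # Unlike a bare except, this lets KeyboardInterrupt and
            # SystemExit propagate, and keeps the reason for the failure.
            results[index] = 0
            errors[index] = traceback.format_exc()
```

After a run, comparing the keys of `errors` between a video that hangs and one that completes would show whether broken frames actually correlate with the hang.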

EDIT: Code should be valid now. Trying to find a file that can reproduce this error.

Nytrox
  • Could you make your simplified version of your code actually run and reproduce the problem? That would be helpful. – rrauenza May 24 '17 at 20:38
  • At least `fd = open(file)` seems to be wrong. A [mcve] would be preferable. – J.J. Hakala May 24 '17 at 20:46
  • Yeah I'll try to make a more working version. – Nytrox May 24 '17 at 21:24
  • Unfortunately, I don't think I can make a complete verifiable example because this error occurs rarely. – Nytrox May 24 '17 at 21:40
  • Hmm, but you could make a version that could be run, right? Because this seems to be missing some details, and it might be there where your possible error source is. So just make an example, that produces the "bad" result for you. And we could potentially run it on our own videos. – fbence May 24 '17 at 21:54
  • Can you find out the process state when you see that it's idle (with the ps command)? That would tell us whether it's a zombie or something else. – Robert May 24 '17 at 21:54
  • The process state is not zombie. It is "Sl" – Nytrox May 24 '17 at 22:08
  • The code should be working, but I can't seem to find a file that breaks it for now. Will update when I find one. – Nytrox May 25 '17 at 17:57
  • I have determined that the crash is not determined by the video. I ran it on a video and it idled, then ran it again at a later time and it worked...I'm honestly pretty lost right now. – Nytrox May 26 '17 at 18:51
