
I am new to Python multiprocessing and I want to understand why my code does not terminate (maybe a zombie process or a deadlock) and how to fix it. The createChain function also runs a for loop and returns a tuple: (value1, value2). Inside createChain there are calls to other functions, but I don't think posting its code will help, because nothing inside it touches multiprocessing. I tried making the processes daemons, but that still didn't work. The strange thing is that if I decrease the value of maxChains, e.g. to 500 or 100, it works.

I just want the processes to do some heavy tasks and put the results into a data structure.

My Python version is 2.7.

from multiprocessing import Process, JoinableQueue, cpu_count

def createTable(chainsPerCore, q, chainLength):
    for chain in xrange(chainsPerCore):
        q.put(createChain(chainLength, chain))


def initTable():
    maxChains = 1000
    chainLength = 10000
    resultsQueue = JoinableQueue()
    numOfCores = cpu_count()
    chainsPerCore = maxChains / numOfCores

    processes = [Process(target=createTable, args=(chainsPerCore, resultsQueue, chainLength,)) for x in range(numOfCores)]

    for p in processes:
        # p.daemon = True
        p.start()

    # Wait for hashing cores to finish
    for p in processes:
        p.join()

    resultsQueue.task_done()

    temp = [resultsQueue.get() for p in processes]
    print temp
  • Is it possible it just takes a really long time? is there any way you can get some kind of signal on how far along the processing is while it is running? – Tadhg McDonald-Jensen May 10 '16 at 14:30
  • It doesn't take that long. If I run it without multiprocessing it terminates in 1 minute. Also, when I use multiprocessing, I monitor the CPU and I see the cores running; after a couple of seconds they stop, so the work has finished, but the program doesn't terminate. Thanks! – Laxmana May 10 '16 at 14:35
  • In that case I'd recommend adding `print` statements in between each of the steps (`p.start()` loop, `p.join()` loop, `resultsQueue.task_done()` and then `temp = ..`) and see which one it freezes at. – Tadhg McDonald-Jensen May 10 '16 at 14:40
  • I added the `print` statements. It freezes on `p.join()`. Neither of the processes gets past `p.join()`. Should I post the rest of the code? – Laxmana May 10 '16 at 14:47
  • maybe, but why is `resultsQueue` a **joinable** queue? Doesn't that mean that things can wait for it to finish before continuing, that would definitely cause a deadlock. – Tadhg McDonald-Jensen May 10 '16 at 15:07
  • It seems I don't understand the role of queues. Thanks for pointing that out. I just want the processes to put their results into an array. I don't want another process to do something with the data in the queue. So what should I do? Create an array, lock when a process wants to append, and then unlock? Is it OK if the array is a global variable, or should I pass it to the method? Thanks again. Now I understand the problem better. – Laxmana May 10 '16 at 15:30

1 Answer


Based on the very useful comments of Tadhg McDonald-Jensen, I now understand my needs better, how queues work, and what purpose they should be used for.

I changed my code to:

from contextlib import closing
from multiprocessing import Pool

def initTable():
    maxChains = 1000

    # closing() calls pool.close() on exit; map() blocks until all results are back
    with closing(Pool(processes=8)) as pool:
        results = pool.map(createChain, xrange(maxChains))

    return results