
I'm developing a GUI which carries out some heavy number crunching. To speed things up I use joblib's Parallel execution together with PyQt's QThreads, so the GUI does not become unresponsive. The Parallel execution works fine on its own, but when it is embedded in the GUI and run in its own thread it utilizes only one of my 4 cores. Is there anything fundamental I missed in the threading/multiprocessing world?

Here is a rough sketch of my setup:

class ThreadRunner(QtCore.QObject):

    start = QtCore.pyqtSignal()
    result_finished = QtCore.pyqtSignal(np.ndarray)

    def __init__(self, function, *args, **kwargs):
        super(ThreadRunner, self).__init__()

        self.function = function
        self.args = args
        self.kwargs = kwargs
        self.start.connect(self.run)

    def run(self):
        print "New Thread started"
        result = self.function(*self.args, **self.kwargs)
        self.result_finished.emit(result)

class Gui(QtGui.QMainWindow, form_class):
    def __init__(self, cl_args, parent=None):
        super(Gui, self).__init__()
        #other stuff

    def start_thread(self, fun, *args, **kwargs):
        self.runner = ThreadRunner(fun, *args, **kwargs)
        self.thread = QtCore.QThread()
        self.runner.moveToThread(self.thread)
        self.thread.start()
        self.runner.start.emit()  # queued connection: run() executes in the new thread
        # more stuff for catching results

def slice_producer(data):
    n_images, rows, cols = data.shape[:3]
    for r in range(rows):
        yield np.copy(data[:,r,...])

def run_parallel(data, *args, **kwargs):
    # memory is a joblib.Memory instance defined elsewhere
    results = joblib.Parallel(n_jobs=4, verbose=12, pre_dispatch='1.5*n_jobs')(
        joblib.delayed(memory.cache(do_long_computation))(data_slice, **kwargs)
        for data_slice in slice_producer(data)
    )
    return results

I hope this is not too long and at the same time not too vague. I use PyQt4 4.11.3 and joblib 0.8.4.

I checked my code again and noticed the following warning:

UserWarning: Multiprocessing backed parallel loops cannot 
be nested below threads, setting n_jobs=1

This refines my question to the following: how do I run a multiprocessing pool from within a separate thread?

mgutsche
  • Have you tried using a simple [multiprocessing pool](https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool) instead of joblib? – ekhumoro Aug 31 '15 at 16:32
  • I've read about it, but it was unclear to me how to achieve a non-blocking execution of my function. The results of apply_async must be fetched somewhere and the GUI should not wait for it, but rather in a thread, right? – mgutsche Sep 01 '15 at 08:07
  • Threads are not really needed, since you could create a [custom event](http://doc.qt.io/qt-4.8/qevent.html#registerEventType) in the callback of `apply_async`, and then use [postEvent](http://doc.qt.io/qt-4.8/qcoreapplication.html#postEvent) to avoid blocking the gui. In fact, it might be possible to use this approach with joblib (which I'm not at all familiar with), and achieve similar results. – ekhumoro Sep 01 '15 at 15:28
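For reference, here is a minimal sketch of the custom-event approach described in the last comment above. The names `ResultEvent`, `Window`, `heavy_work`, `start_work` and `on_result` are made up for illustration; the point is only that the `apply_async` callback posts an event instead of touching widgets, so all GUI updates happen in the main thread.

from PyQt4 import QtCore, QtGui
import multiprocessing
import sys

# Custom event type used to hand results from the pool callback to the GUI thread.
RESULT_EVENT_TYPE = QtCore.QEvent.registerEventType()

class ResultEvent(QtCore.QEvent):
    def __init__(self, result):
        super(ResultEvent, self).__init__(QtCore.QEvent.Type(RESULT_EVENT_TYPE))
        self.result = result

def heavy_work(row, data_slice):
    # Stand-in for the real number crunching; must be a module-level function
    # so it can be pickled for the worker processes.
    return row, sum(data_slice)

class Window(QtGui.QMainWindow):
    def __init__(self):
        super(Window, self).__init__()
        self.pool = multiprocessing.Pool(processes=4)
        self.text = QtGui.QTextEdit()
        self.setCentralWidget(self.text)

    def start_work(self, data):
        for row, data_slice in enumerate(data):
            # The callback runs in the pool's result-handler thread,
            # so it must not touch any widgets directly.
            self.pool.apply_async(heavy_work, args=(row, data_slice),
                                  callback=self.on_result)

    def on_result(self, result):
        # Hand the result over to the GUI thread via the event loop.
        QtGui.QApplication.postEvent(self, ResultEvent(result))

    def customEvent(self, event):
        # Runs in the GUI thread; safe to update widgets here.
        if event.type() == RESULT_EVENT_TYPE:
            row, value = event.result
            self.text.append('row %d done: %s' % (row, value))

if __name__ == '__main__':
    app = QtGui.QApplication(sys.argv)
    win = Window()
    win.show()
    win.start_work([[1, 2, 3], [4, 5, 6]])
    sys.exit(app.exec_())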

1 Answer


Okay, thanks to ekhumoro I arrived at something that works: it uses only one instance of `multiprocessing.Pool` and works with callbacks. The only drawback is that errors in the child process fail silently (e.g. change `results` to `result` in `f_wrapper`). Here is the code for future reference:

from PyQt4.QtCore import *
from PyQt4.QtGui import *
import multiprocessing
import sys
import numpy as np
import time

def f(data_slice, **kwargs):
    '''This is a time-intensive function, which we do not want to alter
    '''
    data = 0
    for row in range(data_slice.shape[0]):
        for col in range(data_slice.shape[1]):
            data += data_slice[row,col]**2
    time.sleep(0.1)
    return data, 3, 5, 3 # some dummy calculation results


def f_wrapper(row, data_slice, **kwargs):
    results = f(data_slice, **kwargs)
    return row, results

class MainWindow(QMainWindow): #You can only add menus to QMainWindows

    def __init__(self):
        super(MainWindow, self).__init__()
        self.pool = multiprocessing.Pool(processes=4)

        button1 = QPushButton('Connect', self)
        button1.clicked.connect(self.apply_connection)
        self.text = QTextEdit()

        vbox1 = QVBoxLayout()
        vbox1.addWidget(button1)
        vbox1.addWidget(self.text)
        myframe = QFrame()
        myframe.setLayout(vbox1)

        self.setCentralWidget(myframe)
        self.show() #display and activate focus
        self.raise_()


    def apply_connection(self):
        self.rows_processed = list()
        self.max_size = 1000
        data = np.random.random(size = (100, self.max_size,self.max_size))
        kwargs = {'some_kwarg' : 1000}
        for row in range(data.shape[1]):
            slice = data[:,row, :]
            print "starting f for row ", row 
            result = self.pool.apply_async(f_wrapper, 
                                           args = (row, slice), 
                                           kwds = kwargs,
                                           callback=self.update_gui)
            #~ result.get() # blocks gui, but raises errors for debugging


    def update_gui(self, result):
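        # Note: apply_async callbacks are invoked in the Pool's result-handler
        # thread, not in the GUI thread.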
        row, func_result = result
        self.rows_processed.append(row)
        print len(self.rows_processed)
        print func_result  # or do something more intelligent
        self.text.append('Applied connection. Row = %d\n' % row)
        if len(self.rows_processed) == self.max_size:
            print "Done!" 




if __name__ == '__main__':
    app = QApplication(sys.argv)
    gui = MainWindow()
    app.exec_()

If there is a nice way to capture errors, that would be a nice bonus.
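One way to surface such errors (a sketch, not part of the code above) is to catch exceptions inside the wrapper that runs in the child process and return the formatted traceback alongside the result, then check for it in the callback. On Python 3.2+ `Pool.apply_async` also accepts an `error_callback` argument for the same purpose.

import traceback

def f_wrapper(row, data_slice, **kwargs):
    # Runs in the child process: any exception is caught and returned as a
    # formatted traceback instead of failing silently.
    try:
        return row, f(data_slice, **kwargs), None
    except Exception:
        return row, None, traceback.format_exc()

# MainWindow.update_gui then unpacks the extra element:
#     row, func_result, error = result
#     if error is not None:
#         print "worker failed for row %d:\n%s" % (row, error)
#         return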

mgutsche