
Is it possible to run, in a Qt application and without freezing the GUI, something like a sklearn grid search that uses several parallel jobs (n_jobs > 1)? The problem is that joblib, which sklearn uses to parallelize its code, cannot start multiprocessing from within a thread.

For example, I'm using a grid search to find the best parameters for an SVR, which is quite computationally intensive.

This question has been asked several times, but no solution has been found:

pyqt5-run-sklearn-calculations-on-a-separate-qthread: suggests using QProcess?

multiprocessing-backed-parallel-loops-cannot-be-nested-below-threads: the threading.current_thread().name = 'MainThread' workaround no longer works now that the issue has been fixed.

joblib-parallel-uses-only-one-core-if-started-from-qthread: suggests rewriting the task using multiprocessing.Pool(processes=4). This approach does not apply to a grid search, where n_jobs is built in.

use sklearn cross validation train, in PyQt button: no answers.

Also, is there any insight into why this is purposely not supported (is it a feature)? It seems like something that would be quite useful.

beesleep
    I am not an expert in scikit-learn, but I work a lot with PyQt and Qt. I could try to give you a solution, but I do not want to learn scikit-learn right now; could you provide the script (with the appropriate inputs and complete imports) to save me that work? – eyllanesc Nov 22 '18 at 00:48

1 Answer


From my understanding of the issue, the problem lies with the default backend used by joblib, namely loky.

After some digging through the joblib and sklearn documentation, I resolved my issue by switching the joblib backend to threading. Note that the call to register_parallel_backend sits in the class body, outside the __init__ function.

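from PyQt5.QtCore import QRunnable  # assumption: PyQt5 bindings, as in the question
from joblib import parallel_backend, register_parallel_backend
from joblib._parallel_backends import ThreadingBackend


class ModelTrainer(QRunnable):
    # Executed once, when the class body is evaluated, not on each instantiation.
    register_parallel_backend('threading', ThreadingBackend, make_default=True)

    def __init__(self, **kwargs):
        super().__init__()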
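
A narrower variant of the same idea, as a minimal sketch: instead of changing the global default, joblib's parallel_backend context manager can scope the threading backend to a single call made from the worker thread. The function score_candidate and the candidate list are hypothetical stand-ins for fitting one parameter setting, and a plain threading.Thread stands in for the Qt worker so the example runs without Qt. Threads share the GIL, so any speedup depends on how much of the work releases it.

```python
import threading

from joblib import Parallel, delayed, parallel_backend


def score_candidate(c):
    """Hypothetical stand-in for fitting and scoring one parameter setting."""
    return c * c


def run_search(candidates, results):
    # The threading backend applies only inside this block;
    # joblib's global default backend is left untouched.
    with parallel_backend("threading"):
        scores = Parallel(n_jobs=2)(delayed(score_candidate)(c) for c in candidates)
    results.extend(scores)


results = []
# A plain thread plays the role of the Qt worker (QRunnable/QThread) here.
worker = threading.Thread(target=run_search, args=([1, 2, 3, 4], results))
worker.start()
worker.join()
print(results)  # [1, 4, 9, 16]
```

In a Qt application the body of run_search would go in the worker's run() method, with the results passed back to the GUI through a signal.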
JoshTinker
  • Yes, but threading is not the same type of parallelization. In the end, I chose to use dask instead. It works well with sklearn, using separate processes. – beesleep Jul 09 '19 at 00:35