I am using joblib (a wrapper around the multiprocessing package) to run a function in parallel over an iterable of arguments. In htop, I see as many processes as there are CPUs (n_jobs=-1 sets that up automatically). However, each process also has cpu_count - 1 threads. Is this expected? Why is there a second layer of parallelism, and where does it come from?
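To cross-check the numbers htop shows from inside Python, one option is to count the OS threads of a process directly. The sketch below is only an illustration and assumes Linux, where `/proc/<pid>/task` holds one entry per thread; `os_thread_count` is a hypothetical helper name, not part of joblib.

```python
import os


def os_thread_count(pid="self"):
    """Return the number of OS threads of a process, or None if unknown.

    Assumption: Linux, where /proc/<pid>/task has one entry per thread.
    On other platforms the directory is missing and None is returned.
    """
    task_dir = f"/proc/{pid}/task"
    try:
        return len(os.listdir(task_dir))
    except OSError:
        return None


if __name__ == "__main__":
    # Print this process's PID and its current OS thread count.
    print(os.getpid(), os_thread_count())
```

Calling the helper from within the function passed to joblib would report the thread count of each worker process, which should match the per-process thread count seen in htop.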

Hanan Shteingart
- Did you find a solution to this problem? – Ivan Bilan Dec 11 '18 at 12:59
- Nope, no one answered... – Hanan Shteingart Dec 12 '18 at 19:42
- I found the solution eventually, see the answer below. – Ivan Bilan Dec 13 '18 at 08:00
1 Answer
The cause seems to be Loky, the backend joblib uses by default. I had exactly the same problem, and performance tanked because far too many threads were running. To get one worker process per core without the extra threads, force joblib to use multiprocessing as the backend:
from joblib import Parallel, delayed
my_list_of_results = Parallel(n_jobs=-1, backend="multiprocessing")(
    delayed(my_function)(my_stuff) for my_stuff in whatever
)
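For something that can be run as-is, here is a minimal, self-contained sketch of the same call. `square` and `run_in_processes` are hypothetical stand-ins for `my_function` and the surrounding code; only `Parallel`, `delayed`, and `backend="multiprocessing"` come from the snippet above.

```python
from joblib import Parallel, delayed


def square(x):
    # Hypothetical stand-in for my_function. It lives at module level so
    # that worker processes can pickle and import it.
    return x * x


def run_in_processes(func, items, n_jobs=-1):
    """Apply func to every item using joblib's multiprocessing backend."""
    return Parallel(n_jobs=n_jobs, backend="multiprocessing")(
        delayed(func)(item) for item in items
    )


if __name__ == "__main__":
    # The guard matters on platforms that spawn worker processes
    # instead of forking them.
    print(run_in_processes(square, range(8)))
```

Watching htop while this runs on a larger workload is a quick way to check whether the per-process thread count actually drops on your setup.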

Ivan Bilan
- Can you please report an issue in joblib? I guess they should look into it. https://github.com/joblib/joblib/issues – Hanan Shteingart Dec 15 '18 at 19:50