I am using joblib (a wrapper around the multiprocessing package) to run a function in parallel over an iterable of arguments. In htop I see as many processes as there are CPUs (n_jobs=-1 sets that automatically). However, each process also has cpu_count - 1 threads. Is this expected? Where does this second layer of parallelism come from?

Hanan Shteingart

1 Answer

This seems to be a problem with Loky, the backend joblib uses by default. I had exactly the same problem, and performance tanked because there were far too many threads. To run one process per core without the extra threads, force joblib to use multiprocessing as the backend:

from joblib import Parallel, delayed

# my_function and whatever are placeholders for your own function and iterable
my_list_of_results = Parallel(n_jobs=-1, backend="multiprocessing")(
    delayed(my_function)(my_stuff) for my_stuff in whatever
)
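
For reference, here is a minimal self-contained sketch of the same call. The names `square` and `run_squares` are hypothetical stand-ins for my_function and the surrounding code, and n_jobs=2 is an arbitrary choice to keep the example small:

```python
from joblib import Parallel, delayed


def square(x):
    # Hypothetical stand-in for my_function.
    return x * x


def run_squares(values, n_jobs=2):
    # backend="multiprocessing" asks joblib for multiprocessing-based workers
    # instead of the default loky backend.
    return Parallel(n_jobs=n_jobs, backend="multiprocessing")(
        delayed(square)(v) for v in values
    )


if __name__ == "__main__":
    # The guard matters on platforms that start workers by spawning
    # a fresh interpreter (e.g. Windows).
    print(run_squares(range(5)))
```

Results come back as a list in the same order as the inputs. Watching htop while a longer-running version of this executes is one way to compare the thread counts of the two backends.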
Ivan Bilan