Is it possible to fit a scikit-learn model in parallel? Something along the lines of model.fit(X, y, n_jobs=20)?

David Stein

1 Answer

It really depends on the model you are trying to fit. Estimators that support parallel fitting usually take an n_jobs parameter when you initialize the model, not in fit(). See the scikit-learn glossary entry on n_jobs. For example, with a random forest:

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_jobs=10)

If it is an ensemble method whose members are fit independently, such as a random forest, it makes sense to parallelize because the individual models can be fit separately (see the help page for ensemble methods). LogisticRegression() also has an n_jobs option, but I honestly don't know how much it speeds up the fitting process, if that's your bottleneck. See also this post.
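To make the contrast with the question's fit(X, y, n_jobs=20) concrete, here is a minimal sketch of a parallel random forest fit. The synthetic dataset, n_estimators=50 and n_jobs=2 are arbitrary choices for illustration, not recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only so the example is self-contained.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# n_jobs is set on the estimator itself; fit() takes no n_jobs argument.
# n_jobs=-1 would request all available cores instead of 2.
clf = RandomForestClassifier(n_estimators=50, n_jobs=2, random_state=0)
clf.fit(X, y)

# Each tree in the forest was fit independently of the others.
print(len(clf.estimators_))
```

Whether this is actually faster than n_jobs=1 depends on the data size and the number of trees; for small problems the overhead of starting workers can outweigh the gain.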

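For other methods like elastic net or SVM (ElasticNet, SVC), I don't think there's a parallelization option for the fit itself. LinearRegression does accept n_jobs, but according to its documentation that only gives a speedup on sufficiently large problems with more than one target.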
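Rather than relying on memory, you can check whether a given estimator exposes n_jobs by looking at its constructor parameters. This is a small sketch; supports_n_jobs is a hypothetical helper name, not part of scikit-learn, and the result can vary between scikit-learn versions:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.svm import SVC


def supports_n_jobs(estimator_cls):
    """Return True if the estimator's constructor exposes an n_jobs parameter.

    Hypothetical helper for illustration; it instantiates the estimator
    with its default arguments and inspects get_params().
    """
    return "n_jobs" in estimator_cls().get_params()


for cls in (RandomForestClassifier, LinearRegression, ElasticNet, SVC):
    print(f"{cls.__name__}: n_jobs available = {supports_n_jobs(cls)}")
```

Note that having the parameter does not guarantee a speedup; it only tells you the estimator has some code path that can use multiple workers.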

StupidWolf