Is it possible to fit a scikit-learn model in parallel? Something along the lines of
model.fit(X, y, n_jobs=20)

David Stein
1 Answer
It depends on the model you are trying to fit. Estimators that support parallelism usually take an n_jobs
parameter when you initialize the model, not in fit(). See the glossary entry on n_jobs. For example, with a random forest:
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_jobs=10)
If it is an ensemble method such as a random forest, it makes sense to parallelize because the individual models can be fit independently (see the help page for ensemble methods). LogisticRegression() also has an n_jobs option, but I honestly don't know how much it speeds up the fitting process, if that's your bottleneck. See also this post.
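To make the random forest case concrete, here is a minimal sketch on synthetic data. The dataset from make_classification, the sizes, and n_estimators=50 are assumptions chosen only for illustration; n_jobs=-1 asks scikit-learn to use all available cores.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, purely for illustration (not from the question).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# n_jobs is set on the estimator, not passed to fit(); -1 means "use all cores".
clf = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
clf.fit(X, y)  # the individual trees are built in parallel

print(len(clf.estimators_))  # number of fitted trees
```

Whether this is actually faster depends on the data size and the number of trees; for small problems the overhead of spawning workers can outweigh the gain.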
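Other estimators such as ElasticNet or SVC have no n_jobs parameter, as far as I know. LinearRegression does accept n_jobs, but according to its documentation it only helps in limited cases, such as multi-target problems.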
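When an estimator has no n_jobs of its own, one possible workaround is to parallelize one level up, for example across the candidate fits of a hyperparameter search. The sketch below assumes synthetic data and an arbitrary grid for C; note that it does not speed up any single SVC fit, it only runs several independent fits at once.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# SVC itself is fit serially; GridSearchCV distributes the
# (parameter, fold) combinations over 2 worker processes.
search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1.0, 10.0]}, cv=3, n_jobs=2)
search.fit(X, y)

print(search.best_params_)
```

The same n_jobs argument exists on cross_val_score, so plain cross-validation can be distributed in the same way.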

StupidWolf