How to correctly choose the number of jobs for the estimator and validation?

Asked Nov 15 '22 at 06:22

Active Nov 15 '22 at 06:48

Viewed 104 times

I have a classification problem and am trying several classifiers on it. I use cross_val_score and cross_val_predict for validation and prediction. Both of these functions, as well as the estimator itself (e.g. LGBMClassifier), support parallelization. I have 46 physical cores, each with 2 logical cores, so 92 in total. How should I set the n_jobs parameter in each of these calls to get the best performance?

from lightgbm import LGBMClassifier
from sklearn.model_selection import cross_val_score, cross_val_predict

# X, y = predefined data
model = LGBMClassifier(n_estimators=100, tree_learner='feature', n_jobs=???)

score = cross_val_score(model, X, y, cv=5, n_jobs=???)

My guess is that the estimator's n_jobs should depend on the parallelization technique, e.g. for the feature-parallel case it should equal the number of features. As for validation, it should probably depend on the number of folds. But this is only a guess. Is there a definitive answer?
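To make the guess concrete, here is a minimal sketch of one possible heuristic, not a definitive answer. It assumes the two levels of parallelism multiply (each cross-validation worker runs its own copy of the estimator with its own threads), so the product of the two n_jobs values is kept at or below the number of physical cores. The helper name split_job_budget is hypothetical and not part of scikit-learn or LightGBM.

```python
def split_job_budget(physical_cores, n_folds):
    """Hypothetical helper: split a core budget between CV and the estimator.

    Returns (cv_jobs, model_jobs) such that cv_jobs * model_jobs never
    exceeds physical_cores. This is a heuristic starting point only.
    """
    if physical_cores < 1 or n_folds < 1:
        raise ValueError("physical_cores and n_folds must be positive")
    # More CV workers than folds would sit idle, so cap at n_folds.
    cv_jobs = min(n_folds, physical_cores)
    # Give each CV worker an equal share of the remaining cores (at least 1).
    model_jobs = max(1, physical_cores // cv_jobs)
    return cv_jobs, model_jobs


# Values from the question: 46 physical cores, cv=5.
cv_jobs, model_jobs = split_job_budget(physical_cores=46, n_folds=5)
print(cv_jobs, model_jobs)  # 5 9
```

Under these assumptions the values would be passed as LGBMClassifier(n_jobs=model_jobs) and cross_val_score(..., n_jobs=cv_jobs). Whether this split beats the alternatives (for example, n_jobs=1 for cross-validation and all physical cores for the estimator) depends on the data size and the model, so it is worth timing both on the actual problem.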

Nourless

0 Answers