I am trying to learn more about parallelisation in order to speed up this classification code. For context, I only started reading about it less than 24 hours ago. I am wondering which multiprocessing technique would be best suited to this problem and what sort of speed improvement I could expect. Lastly, any suggestions on how to structure the code would be highly appreciated. I am currently looking into the ray, joblib and multiprocessing libraries.
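from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.svm import SVC

# price (list of column names), X, X_train and y_train are defined elsewhere.
def clf():
    cal_probs = []
    # One calibrated classifier per price column; the fits are independent.
    for col in price:
        # Cross-validation strategy
        cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=1)
        # Classifier
        tune_clf = CalibratedClassifierCV(
            SVC(gamma='scale', class_weight='balanced', C=0.01),
            method="isotonic",
            cv=cv,
        ).fit(X_train[[col, 'regime']], y_train[col])
        # Calibrated probabilities
        pred_probs = tune_clf.predict_proba(X[[col, 'regime']])
        cal_probs.append(pred_probs)
    return cal_probs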
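
Since each column is fitted independently of the others, the loop looks like a candidate for running one task per column. Below is a minimal sketch of how I imagine this could be structured with joblib; it is not a tested solution for my real data. The helper names `fit_one` and `fit_all` and the toy data at the bottom are hypothetical and only there to make the example runnable. As far as I understand, the speed-up is bounded by the number of cores (or the number of columns, whichever is smaller), and process start-up and data transfer will eat into that.

```python
import numpy as np
import pandas as pd
from joblib import Parallel, delayed
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.svm import SVC


def fit_one(col, X_train, y_train, X):
    """Fit and calibrate one classifier for a single price column."""
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=1)
    model = CalibratedClassifierCV(
        SVC(gamma="scale", class_weight="balanced", C=0.01),
        method="isotonic",
        cv=cv,
    )
    features = [col, "regime"]
    model.fit(X_train[features], y_train[col])
    return model.predict_proba(X[features])


def fit_all(price, X_train, y_train, X, n_jobs=-1):
    """Run fit_one for every column in parallel; results follow the order of `price`."""
    return Parallel(n_jobs=n_jobs)(
        delayed(fit_one)(col, X_train, y_train, X) for col in price
    )


# Hypothetical toy data (assumption), standing in for the real price frame.
rng = np.random.default_rng(0)
price = ["p0", "p1", "p2"]
X = pd.DataFrame(rng.normal(size=(120, 3)), columns=price)
X["regime"] = rng.integers(0, 2, size=120)
y = (X[price] > 0).astype(int)
X_train, y_train = X.iloc[:90], y.iloc[:90]

cal_probs = fit_all(price, X_train, y_train, X, n_jobs=2)
```

`CalibratedClassifierCV` also accepts its own `n_jobs` argument to parallelise over the CV splits; I assume it is better to parallelise at only one of the two levels to avoid oversubscribing the cores.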