I'm writing a custom GridSearchCV function that calls StratifiedKFold from inside a multiprocessing loop. StratifiedKFold gives the same accuracy n times, where n is the number of processes.

import multiprocessing

PROCESSES = 5

with multiprocessing.Pool(PROCESSES) as pool:
    params = iter(list(range(0, 5)))  # I'm not using a parameter for now
    results = [pool.apply_async(computeKNN) for p in params]
    for r in results:
        print('\t', r.get())
        with open("output.txt", "a") as outfile:
            print(r.get(), file=outfile)
        print(r.get())
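
from sklearn.model_selection import StratifiedKFold

def computeKNN():
    # random_state=None: no explicit seed is passed for the shuffle
    kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=None)
    results = {}
    i = 0
    score = 0
    best_score = 0
    ....
    for train_index, val_index in kf.split(X, y):
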
Inside the loop I fit the model and compute the accuracy score. In this example, when running 5 processes, I get the same accuracy 5 times. Am I missing a parameter for StratifiedKFold? I assumed I would get different results even when the code is run n times in parallel.
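
One possible explanation, offered as an assumption rather than a confirmed diagnosis: with random_state=None, StratifiedKFold draws from NumPy's global random state, and where worker processes are created by forking, each worker starts with a copy of the parent's state, so every worker can produce the same shuffle. A minimal sketch of one workaround is to pass a distinct seed to each task. The name compute_knn, the seed values, the KNN settings, and the synthetic data from make_classification are all hypothetical stand-ins, since the original X, y and loop body are not shown.

```python
import multiprocessing

from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): the original X and y are not shown in the question.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)


def compute_knn(seed):
    """Hypothetical variant of computeKNN that takes an explicit seed."""
    # An explicit, per-task seed makes the shuffle independent of whatever
    # global random state the worker process started with.
    kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    model = KNeighborsClassifier(n_neighbors=5)  # illustrative setting
    scores = cross_val_score(model, X, y, cv=kf, scoring="accuracy")
    return seed, scores.mean()


def main():
    seeds = range(5)  # one distinct seed per task (illustrative values)
    with multiprocessing.Pool(5) as pool:
        results = [pool.apply_async(compute_knn, (s,)) for s in seeds]
        for r in results:
            print(r.get())


if __name__ == "__main__":
    main()
```

Because each task now shuffles with its own seed, the folds can differ between tasks, while rerunning a task with the same seed reproduces the same folds. Whether the accuracies actually differ still depends on the data and the model.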