I have a simple training task that requires rolling training of my model: I use the previous 12 months of data to predict the next month's label, and I rerun the model every month.
I wrote something like:
import numpy as np
import optuna
from optuna.samplers import TPESampler
from tqdm import tqdm

params_dict = {}
faetures_dict = {}
current_params = np.nan
sampler = TPESampler(seed=10)  # I set the seed as mentioned in the documentation
for i in tqdm(range(len(trainSchecule))):
    # this is just a list of dates
    temp_dt_range = trainSchecule.loc[i-period:i-1]['TRADE_DT'].tolist()
    # this is a list of dataframes containing the data of each month
    temp_df = [allDataDict[x] for x in temp_dt_range]
    study = optuna.create_study(direction="maximize", sampler=sampler)
    study.optimize(lambda trial: objective(trial, temp_df), n_trials=100)
    params_dict[trainSchecule.loc[i,'TRADE_DT']] = study.best_trial.params
    current_params = study.best_trial.params
The code works fine, but when I manually set, for example, i = 10 and run
temp_dt_range = trainSchecule.loc[i-period:i-1]['TRADE_DT'].tolist()
temp_df = [allDataDict[x] for x in temp_dt_range]
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(lambda trial: objective(trial, temp_df), n_trials=100)
params_dict[trainSchecule.loc[i,'TRADE_DT']] = study.best_trial.params
current_params = study.best_trial.params
study.best_trial.params gives a different result from the one produced for i = 10 inside the loop.
However, when I rerun the whole loop, the results are reproducible.
Does anyone know why this happens?
I have tried setting sampler = TPESampler(seed=10), but it doesn't help.
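
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the sampler is created once, before the loop, so its internal random state keeps advancing from one study to the next. By the time the loop reaches i = 10 the generator has already been used by ten earlier studies, whereas a manual run starts from whatever state the sampler happens to be in at that moment. The sketch below illustrates the effect with Python's `random.Random` as a stand-in for the sampler's random state. It does not use Optuna, and `run_study` is a hypothetical helper, not part of any library.

```python
import random


def run_study(rng, n_trials=5):
    """Hypothetical stand-in for study.optimize: draw candidates, keep the best."""
    return max(rng.random() for _ in range(n_trials))


# Case 1: one seeded generator shared by every iteration (like the loop above).
shared_rng = random.Random(10)
loop_results = [run_study(shared_rng) for _ in range(12)]

# Case 2: "iteration 10" run by hand with a generator in a different state
# (here a freshly seeded one) than the loop had when it reached i = 10.
manual_result = run_study(random.Random(10))

# Case 3: the seeded generator is re-created inside the loop, so every
# iteration starts from the same state no matter how many ran before it.
fresh_results = [run_study(random.Random(10)) for _ in range(12)]

print(manual_result == loop_results[0])    # same starting state as iteration 0
print(manual_result == loop_results[10])   # state differs from the loop's at i = 10
print(manual_result == fresh_results[10])  # same starting state in both runs
```

If this is indeed the cause, the analogous change in the Optuna code would be to call `TPESampler(seed=10)` inside the loop, just before `optuna.create_study`, so that each study gets a freshly seeded sampler. This only helps if `objective` itself is deterministic, for example if the model inside it also uses a fixed random seed.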