Why can't Optuna reproduce my LGBM result from the for loop?

Question

I have a simple training task that requires rolling training: I use the previous 12 months of data to predict the next month's label, and I retrain the model every month.

I wrote something like:

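import numpy as np
import optuna
from optuna.samplers import TPESampler
from tqdm import tqdm

params_dict = {}
features_dict = {}
current_params = np.nan
# one sampler object, created once and shared by every study below
sampler = TPESampler(seed=10) # I set the seed as mentioned in documentation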
for i in tqdm(range(len(trainSchecule))): 
    # this is just a list of dates
    temp_dt_range = trainSchecule.loc[i-period:i-1]['TRADE_DT'].tolist()
    # this is a list of dataframe contains data of each month
    temp_df = [allDataDict[x] for x in temp_dt_range]
                
    study = optuna.create_study(direction="maximize", sampler=sampler)
    study.optimize(lambda trial: objective(trial, temp_df), n_trials=100)
        
    params_dict[trainSchecule.loc[i,'TRADE_DT']] = study.best_trial.params
    current_params = study.best_trial.params

The code works fine, but when I manually set, for example, i = 10 and run

temp_dt_range = trainSchecule.loc[i-period:i-1]['TRADE_DT'].tolist()
temp_df = [allDataDict[x] for x in temp_dt_range]
                
study = optuna.create_study(direction="maximize", sampler=sampler)
study.optimize(lambda trial: objective(trial, temp_df), n_trials=100)
        
params_dict[trainSchecule.loc[i,'TRADE_DT']] = study.best_trial.params
current_params = study.best_trial.params

study.best_trial.params gives a different result from the one produced for i = 10 inside the loop. However, when I rerun the whole loop, the results are reproducible.

Does anyone know why this happens?

I have already set sampler = TPESampler(seed=10), but it doesn't help.
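One possible explanation, sketched below with the standard library only: a seeded sampler that is created once and shared across studies keeps advancing its internal random state, so the study at i = 10 inside the loop starts from a different state than a manual run made afterwards with the same object. ToySampler and run_study are hypothetical stand-ins, not Optuna APIs; the sketch only assumes that the sampler holds a single random generator seeded at construction.

```python
import random


class ToySampler:
    """Hypothetical stand-in for a seeded sampler: one RNG, seeded once."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def suggest(self):
        # Each call consumes from the same random stream.
        return self._rng.random()


def run_study(sampler, n_trials=3):
    """Hypothetical stand-in for one study: draw n_trials suggestions."""
    return [sampler.suggest() for _ in range(n_trials)]


# Case 1: one sampler shared by every iteration (as in the question).
shared = ToySampler(seed=10)
loop_results = [run_study(shared) for _ in range(3)]

# Rerunning the whole loop with a newly seeded sampler reproduces it.
shared_again = ToySampler(seed=10)
loop_results_again = [run_study(shared_again) for _ in range(3)]

# A manual rerun of the last iteration reuses the already-advanced sampler,
# so it continues the random stream instead of restarting it.
manual_shared = run_study(shared)

# Case 2: a fresh sampler with the same seed for every study.
fresh_results = [run_study(ToySampler(seed=10)) for _ in range(3)]
manual_fresh = run_study(ToySampler(seed=10))

print(loop_results == loop_results_again)  # whole loop is reproducible
print(manual_shared == loop_results[2])    # single iteration is not
print(manual_fresh == fresh_results[2])    # fresh sampler per study is
```

In the Optuna code above, the analogous change would be to construct TPESampler(seed=10) inside the loop body, right before optuna.create_study, so that each iteration starts from the same sampler state whether it runs in the loop or on its own. This is a guess at the cause rather than a confirmed diagnosis; nondeterminism in the LightGBM training inside objective could also contribute.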
