The reason for this message, given the procedure in the referenced article, is the loop inside the objective() function. In the first fold, the LightGBMPruningCallback reports the intermediate iteration results to Optuna, which Optuna needs to decide whether the trial should be pruned. In the second fold, the callback again reports values for iterations that have already been reported. The error occurs because, from Optuna's point of view, n (= number of folds) values from the LightGBMPruningCallback are recorded within a single trial for the very same iteration.
By iterations I mean the boosting steps during gradient descent, for which Optuna typically sees decreasing loss values. Since the callback monitors and reports them, there shouldn't be redundant values reported for the very same iteration (1, 2, 3, ...).
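To see why the duplicates arise, here is a minimal pure-Python sketch (no LightGBM or Optuna required; the decreasing loss is just a stand-in) of the reporting pattern the callback produces during cross-validation:

```python
# Simulate what the pruning callback reports during 3-fold CV:
# each fold restarts boosting at iteration 0, so every step number
# is reported n_folds times within one trial.
n_folds, n_iterations = 3, 5

reports = []  # (step, value) pairs as Optuna would receive them
for fold in range(n_folds):
    for step in range(n_iterations):
        value = 1.0 / (step + 1)  # stand-in for a decreasing loss
        reports.append((step, value))

steps = [step for step, _ in reports]
print(steps.count(0))  # every step, including 0, appears n_folds times
```

Each step number shows up three times within the one trial, which is exactly the duplicate-step situation the error message complains about.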
You can fix this by calling the LightGBMPruningCallback only in the very first fold and not in the others, with something like:

for idx, (train_idx, test_idx) in enumerate(cv.split(X, y)):
    [...]
    if idx == 0:
        # pass the pruning callback only for the first fold
        callbacks = [LightGBMPruningCallback(trial, metric)]
    else:
        callbacks = []
    [...]
This triggers the callback when training on the very first fold and suppresses further reports for the remaining folds. The drawback here is that whether a trial should be pruned is determined only by the training behavior on the first fold [...].
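Running the same simulation with the fold-0 guard shows the duplicates disappear (again pure Python; all names are illustrative):

```python
n_folds, n_iterations = 3, 5

reports = []  # (step, value) pairs as Optuna would receive them
for fold in range(n_folds):
    for step in range(n_iterations):
        value = 1.0 / (step + 1)  # stand-in for a decreasing loss
        if fold == 0:  # report to the pruner only from the first fold
            reports.append((step, value))

# exactly one value per iteration -> no duplicate-step reports
print(len(reports))  # 5
```

The pruner now sees a single, monotonically indexed sequence of steps per trial, at the cost of judging the trial on the first fold alone.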
Personally, I don't use the LightGBMPruningCallback at all when performing cross-validation within an Optuna trial.
Hope this helps.