
When tuning parameters in Optuna, I have an invalid subspace in my space of possible parameters. In my particular case, two of the parameters that I'm tuning can cause extremely long trials (that I want to avoid) if they are both close to zero (< 1e-5), i.e.:

             A > 1e-5    A < 1e-5
B > 1e-5     OK          OK
B < 1e-5     OK          TIMEOUT

I'm obviously able to catch this edge case when both A < 1e-5 and B < 1e-5, but how should I let Optuna know that this is an invalid trial? I don't want to change the sampling ranges for A and B to exclude values < 1e-5, as it is fine if only one of A and B is < 1e-5.

I have two ideas so far:

  1. Raise an Optuna pruning exception optuna.exceptions.TrialPruned. This would prune the trial before the code timed out, but I'm unsure if this tells Optuna that this is a bad area of the search space to evaluate. If it does guide the tuning away from this edge case, then I think this is the best option.

  2. Return some fixed trial score, e.g. 0. I know my trials will have a score between 0 and 1, therefore if this invalid edge case is reached, I could return the minimum possible score of 0. However, if most trial scores are 0.5 or greater, then the value of 0 for the edge case becomes an extreme outlier.
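
For concreteness, the second idea could look something like the sketch below. This is only a minimal illustration; the scoring expression is a hypothetical stand-in for the real evaluation, which returns a value in [0, 1]:

import optuna

LIM = 1e-5


def objective(trial):
    a = trial.suggest_float('a', 0.0, 1.0)
    b = trial.suggest_float('b', 0.0, 1.0)

    if a < LIM and b < LIM:
        # Invalid configuration: skip the expensive evaluation and
        # return the minimum possible score instead.
        return 0.0

    # Hypothetical stand-in for the real evaluation (score in [0, 1]).
    return 1.0 - abs(a - b)


study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)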

MWE:

import optuna


class MWETimeoutTuner:

    def __call__(self, trial):
        # Using a limit of 0.1 rather than 1e-5 so the edge case is triggered quicker
        lim = 0.1

        trial_a = trial.suggest_float('a', 0.0, 1.0)
        trial_b = trial.suggest_float('b', 0.0, 1.0)
        trial_c = trial.suggest_float('c', 0.0, 1.0)
        trial_d = trial.suggest_float('d', 0.0, 1.0)

        # Without this, we end up stuck in the infinite loop in _func_that_can_timeout.
        #  But is pruning the trial the best way to avoid an invalid parameter configuration?
        if trial_a < lim and trial_b < lim:
            raise optuna.exceptions.TrialPruned()

        def _func_that_can_timeout(a, b, c, d):
            # This mocks the timeout situation due to an invalid parameter configuration.
            if a < lim and b < lim:
                print('TIMEOUT:', a, b)
                while True:
                    pass

            # The maximum possible score would be 2 (c=1, d=1, a=0, b=0)
            #  However, as only one of a and b can be less than 0.1, the actual maximum is 1.9.
            #  Either (c=1, d=1, a=0, b=0.1) or (c=1, d=1, a=0.1, b=0)
            return c + d - a - b

        score = _func_that_can_timeout(trial_a, trial_b, trial_c, trial_d)
        return score


if __name__ == "__main__":
    tuner = MWETimeoutTuner()
    n_trials = 1000
    direction = 'maximize'
    study_uid = "MWETimeoutTest"
    study = optuna.create_study(direction=direction, study_name=study_uid)
    study.optimize(tuner, n_trials=n_trials)

I've found this related issue, which suggests changing the sampling process based on existing values that have been sampled. In the MWE, this would look like:

trial_a = trial.suggest_float('a', 0.0, 1.0)
if trial_a < lim:
    trial_b = trial.suggest_float('b', lim, 1.0)
else:
    trial_b = trial.suggest_float('b', 0.0, 1.0)

However, upon testing, this produces the following warning:

RuntimeWarning: Inconsistent parameter values for distribution with name "b"! This might be a configuration mistake. Optuna allows to call the same distribution with the same name more then once in a trial. When the parameter values are inconsistent optuna only uses the values of the first call and ignores all following. Using these values: {'low': 0.1, 'high': 1.0}.>

So this doesn't seem to be a valid solution.

In the MWE, raising the pruning exception works, and (near-)optimal values are found. It seems that in writing this question I have almost answered it myself: pruning looks like the way to go, unless there is a better solution?

DaBigJoe

2 Answers


Have a look at this Optuna example of constrained optimization using the BoTorch sampler.
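
For reference, the constraint mechanism there is a constraints_func callable passed to the sampler; constraint values <= 0 are treated as feasible and values > 0 as infeasible. Adapted to the MWE from the question, it could look roughly like the sketch below. This is an assumption-laden sketch, not the linked example itself: it assumes a recent Optuna version, and it uses the built-in TPESampler, which also accepts constraints_func as an experimental feature, instead of the BoTorch sampler.

import optuna

LIM = 0.1


def objective(trial):
    a = trial.suggest_float('a', 0.0, 1.0)
    b = trial.suggest_float('b', 0.0, 1.0)
    c = trial.suggest_float('c', 0.0, 1.0)
    d = trial.suggest_float('d', 0.0, 1.0)

    # Constraint convention: <= 0 is feasible, > 0 is infeasible.
    infeasible = a < LIM and b < LIM
    trial.set_user_attr('constraint', (1.0 if infeasible else -1.0,))

    if infeasible:
        # Skip the expensive evaluation; return a cheap worst-case score so the
        # trial still completes and its constraint value is recorded.
        return -2.0

    return c + d - a - b


def constraints_func(trial):
    return trial.user_attrs['constraint']


# constraints_func can also be passed to optuna.integration.BoTorchSampler.
sampler = optuna.samplers.TPESampler(constraints_func=constraints_func)
study = optuna.create_study(direction='maximize', sampler=sampler)
study.optimize(objective, n_trials=100)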

ferdy

I have been considering this as well. My use case is that I want to terminate trials whose objective is starting to explode (diverge/fail). I can't use the supplied pruners for this because I am already using a patient median pruner, so I can't also use the threshold pruner at the same time (unless, perhaps, I define a custom pruner).

My code is something like this:

def objective(trial):
    ...
    # Train loop:
    for epoch in range(n_epochs):
        for X, y in train_loader:
            ...

        # Handle auto pruning:
        trial.report(validation_error_rate, epoch)
        if trial.should_prune():
            raise optuna.exceptions.TrialPruned()

        # Now handle MANUAL pruning:
        # prune the trial if its validation_error_rate has blown up.
        if validation_error_rate > 1e3:  # astronomically high
            raise optuna.exceptions.TrialPruned()
    ...
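
As for the custom-pruner idea mentioned above, it would be fairly small to write, since trial.should_prune() just delegates to the study's pruner. A wrapper that prunes when any of several pruners would prune could look roughly like this (an untested sketch; the specific combination of pruners is only an example):

import optuna


class AnyPruner(optuna.pruners.BasePruner):
    """Prune a trial if any of the wrapped pruners would prune it."""

    def __init__(self, *pruners):
        self._pruners = pruners

    def prune(self, study, trial):
        return any(p.prune(study, trial) for p in self._pruners)


# Example combination: a patient median pruner plus an upper-threshold pruner.
pruner = AnyPruner(
    optuna.pruners.PatientPruner(optuna.pruners.MedianPruner(), patience=3),
    optuna.pruners.ThresholdPruner(upper=1e3),
)
study = optuna.create_study(direction='minimize', pruner=pruner)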

I ran two studies whose only difference was whether manual pruning was included. Both yielded a similar best_trial. The top 10 results overall came from the study without manual pruning; below the top 10, the study with manual pruning produced the better trials.

So manual pruning doesn't seem to harm the best model found, but it didn't consistently find a better one when I repeated the experiment, and it appears to affect the sampler negatively in that its top 10 models are generally worse. Overall, I preferred the set of models from the study without manual pruning.

If the OP has any new information or insights, please share.

Caveat: I only ran a couple of such comparisons, and did not control for random initialisation.

some3128
    It's been a while since I looked into this, but I believe I ended up using manual pruning and increasing the number of trials. – DaBigJoe Jun 22 '23 at 09:00