1

I have a very complex tree-structured search space. At the top level I need to make a categorical choice: which subspace of parameters to explore. As a simple example, imagine that I need to decide between a linear regression, an SVM, or some neural network. Each subspace has a vastly different size, and I would like the Optuna sampler (I am thinking of using TPESampler) to spend more time exploring the larger spaces.

If I do

  trial.suggest_categorical("network_type", ["linear", "svm", "nn"])

and branch off depending on what is picked here, I will be exploring the small subspaces more than I want. Conceptually, if I could pass a weight for each category, it would solve my problem.

I can imagine asking for a random float and mapping it to the right category, but that has a few obvious issues: the results are harder to inspect, and the sampler won't actually know that it is a categorical space (e.g. floats that are close together can represent completely different categories).
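For concreteness, here is a minimal sketch of that float-mapping workaround. The helper name `suggest_weighted_category`, the `WEIGHTS` values, and the `"_u"` parameter suffix are all made up for illustration; `trial` is assumed to be an `optuna.trial.Trial` (only its `suggest_float` and `set_user_attr` methods are used). Storing the chosen category as a user attribute is one way to make the results a bit easier to read.

```python
import bisect
import itertools

# Hypothetical weights, roughly proportional to the size of each subspace.
WEIGHTS = {"linear": 1.0, "svm": 3.0, "nn": 16.0}


def suggest_weighted_category(trial, name, weights):
    """Map one suggested float onto a category, proportionally to its weight.

    `trial` is assumed to behave like optuna.trial.Trial: it must provide
    suggest_float(name, low, high) and set_user_attr(key, value).
    """
    categories = list(weights)
    # Running totals, e.g. [1.0, 4.0, 20.0] for the weights above.
    cumulative = list(itertools.accumulate(weights[c] for c in categories))
    # Each category owns an interval whose width equals its weight.
    u = trial.suggest_float(f"{name}_u", 0.0, cumulative[-1])
    # Clamp the index because the upper bound of suggest_float is inclusive.
    index = min(bisect.bisect_right(cumulative, u), len(categories) - 1)
    choice = categories[index]
    # Record the decoded category so it shows up alongside the trial.
    trial.set_user_attr(name, choice)
    return choice
```

Inside an objective this would be called as `suggest_weighted_category(trial, "network_type", WEIGHTS)`. The weights only hold exactly while the float is sampled uniformly (for example during random startup trials), and the sampler still sees an ordinary float, so the issues listed above remain.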

In this simple example, TPESampler will probably find which "network_type" works well and pick it more often. In my actual use case, I have more than a thousand categories.

Edit:

I appreciate hearing people's experience that Optuna just worked well without many priors. I guess it is very problem dependent. When the search space is highly structured with multiple branches of O(1000) cardinality, even getting one sample from each leaf subspace can be pretty hard.

It might be that no automatic tuner would work on my problem, but I would still appreciate knowing the answer to the actual question. Can one somehow "guide" Optuna's choices in categorical distributions?

iga

2 Answers

0

It feels like you're trying to second-guess a black-box optimizer. I recommend letting Optuna learn the spaces and look for promising answers.

In this simple example, TPESampler will probably find which "network_type" works well and pick it more often.

Right. Isn't finding values that work well what you want to do?

Personally, I have tried to "help" Optuna by using the visualizations and manually reducing the search space. I generally found that by the time I could see the better search areas, TPE already knew them as well, so I would have gotten an answer faster by just letting Optuna run.

0

The TPESampler takes an n_startup_trials argument that specifies how many random trials to perform before it starts using its model to propose values. You can set this to half or three quarters of the trials you intend to run, so that the space is sampled fairly evenly before the sampler starts to prioritize parameters that are predicted to be good. Also consider setting multivariate and group to True if your parameters aren't independent.
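A rough sketch of that configuration, assuming a planned budget of 200 trials. The helper names `startup_trials` and `make_sampler` and the budget are illustrative, not part of Optuna. `n_startup_trials`, `multivariate`, and `group` are real `TPESampler` arguments; `group=True` requires `multivariate=True`, and Optuna may emit an experimental-feature warning for both.

```python
def startup_trials(n_trials, fraction=0.5):
    """Number of purely random trials to run before TPE takes over."""
    return max(1, int(n_trials * fraction))


def make_sampler(n_trials, fraction=0.5):
    """Build a TPESampler that spends `fraction` of the budget on random search."""
    import optuna  # imported lazily so the helper above has no dependencies

    return optuna.samplers.TPESampler(
        n_startup_trials=startup_trials(n_trials, fraction),
        multivariate=True,  # model parameters jointly instead of independently
        group=True,         # handle each branch of the conditional space separately
    )
```

Typical use would be `study = optuna.create_study(sampler=make_sampler(200, fraction=0.75))` followed by `study.optimize(objective, n_trials=200)`, where `objective` is your own function.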

Permafacture