I have a very complex tree-structured search space. At the top level I need to make a categorical choice: which subspace of parameters to explore. As a simple example, imagine I need to decide between using a linear regression, an SVM, or some neural network. Each subspace has a vastly different size, and I would like the Optuna sampler (I am thinking of TPESampler) to spend more time exploring the larger spaces.
If I do

```python
trial.suggest_categorical("network_type", ["linear", "svm", "nn"])
```

and branch off depending on what is picked here, I will be exploring the small subspaces more than I want. Conceptually, if I could pass a weight for each category, it would solve my problem.
I can imagine asking for a random float and mapping it to the right category, but that has a few obvious issues, including: it is harder to inspect the results, and the sampler won't actually know that it is a categorical space (e.g. floats that are close together will represent completely different categories).
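The float-mapping workaround I mean would look roughly like this (the categories and weights are made-up placeholders); it gives weighted categories mechanically, but the sampler still only sees one continuous parameter:

```python
import bisect
import itertools

# Hypothetical weights, e.g. proportional to each subspace's size.
categories = ["linear", "svm", "nn"]
weights = [1.0, 2.0, 7.0]

# Cumulative weight boundaries, normalized to [0, 1].
total = sum(weights)
cum = list(itertools.accumulate(w / total for w in weights))

def to_category(u):
    """Map a uniform float u in [0, 1] to a category via the weights."""
    # min() guards the u == 1.0 edge case.
    return categories[min(bisect.bisect_right(cum, u), len(categories) - 1)]

# Inside the objective this would be something like:
#   u = trial.suggest_float("network_type_raw", 0.0, 1.0)
#   network_type = to_category(u)
```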
In this simple example, TPESampler will probably find which "network_type" works well and pick it more often. In my actual use case, I have more than a thousand categories.
Edit:
I appreciate knowing people's experiences that Optuna just worked well without explicit priors. I guess it is very problem-dependent. When the search space is highly structured with multiple branches of O(1000) cardinality, even getting one sample from each leaf subspace can be pretty hard.
It might be that no automatic tuner would work on my problem, but I would still appreciate an answer to the actual question: can one somehow "guide" Optuna's choices in categorical distributions?