Sweeping over multiple configurations

Question

I'm interested in using Hydra to run some experiments over various datasets. Following the documentation, I've set up my conf directory as follows:

conf
├── config.yaml
├── dataset
│   ├── experiment_1_0.yaml
│   ├── experiment_2_0.yaml
│   ├── experiment_2_1.yaml
│   ├── experiment_3_0.yaml
│   ├── experiment_3_1.yaml
│   └── experiment_3_2.yaml
└── model
    ├── dummy.yaml
    └── naive_bayes.yaml

My config.yaml file looks like this:

defaults:
  - model: ???
  - dataset: ???
  - _self_

hydra:
  sweeper:
    params:
      model: glob(*)
      dataset: glob(*)

However, running my main.py file with python main.py yields the following error:

You must specify 'dataset', e.g, dataset=<OPTION>
Available options:
        experiment_1_0
        experiment_2_0
        experiment_2_1
        experiment_3_0
        experiment_3_1
        experiment_3_2

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Hydra asks me to specify the dataset instead (and if I do that, it then asks me to specify the model). I suspect the problem is in the config.yaml file. Are the ??? values valid there? I believe I have followed the documentation closely, so how can I sweep over all 12 configs (2 models × 6 datasets)?

score 1 · Answer 1 · answered Nov 08 '22 at 18:50

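I believe the answer is to add the --multirun flag:

python main.py --multirun

With the flag, the experiments run as expected. As far as I can tell, the hydra.sweeper.params entries are only applied in multirun mode, so a plain python main.py never fills in the ??? placeholders and Hydra reports them as missing.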
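To make the "12 configs" concrete, here is a minimal sketch of the combinations a sweep over both config groups should cover. It is plain Python and does not use Hydra's API; the helper name expand_sweep is hypothetical, the option names are copied from the conf tree in the question, and the order in which Hydra actually launches jobs may differ.

```python
from itertools import product

# Option names copied from the conf/ tree in the question.
MODELS = ["dummy", "naive_bayes"]
DATASETS = [
    "experiment_1_0",
    "experiment_2_0",
    "experiment_2_1",
    "experiment_3_0",
    "experiment_3_1",
    "experiment_3_2",
]


def expand_sweep(models, datasets):
    """Return one override list per job: the Cartesian product of both groups.

    Hypothetical helper for illustration only; this is not Hydra's code.
    """
    return [[f"model={m}", f"dataset={d}"] for m, d in product(models, datasets)]


jobs = expand_sweep(MODELS, DATASETS)

# Print each combination the way it would be passed as command-line overrides.
for number, overrides in enumerate(jobs):
    print(f"#{number} : {' '.join(overrides)}")
```

Each printed line corresponds to one model/dataset pair, so checking that the multirun launches the same number of jobs is a quick way to confirm the sweep covered everything.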

Demetri Pananos
