I'm interested in using Hydra to run experiments over several datasets. Following the documentation found here, I've set up my conf directory as follows:

conf
├── config.yaml
├── dataset
│   ├── experiment_1_0.yaml
│   ├── experiment_2_0.yaml
│   ├── experiment_2_1.yaml
│   ├── experiment_3_0.yaml
│   ├── experiment_3_1.yaml
│   └── experiment_3_2.yaml
└── model
    ├── dummy.yaml
    └── naive_bayes.yaml

My config.yaml file looks like this:

defaults:
  - model: ???
  - dataset: ???
  - _self_

hydra:
  sweeper:
    params:
      model: glob(*)
      dataset: glob(*)

However, running main.py with python main.py yields the following error:

You must specify 'dataset', e.g, dataset=<OPTION>
Available options:
        experiment_1_0
        experiment_2_0
        experiment_2_1
        experiment_3_0
        experiment_3_1
        experiment_3_2

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Hydra asks me to specify the dataset instead (and were I to do that, it would then ask me to specify the model). I suspect the problem is in config.yaml. Are the ??? entries valid? I believe I have followed the documentation closely, so how can I run the sweep over all 12 configs?

Demetri Pananos

1 Answer

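I believe the answer is to add the --multirun flag. Without it, Hydra does a single run in which the sweeper (and therefore hydra.sweeper.params) is not used, so the mandatory ??? entries in the defaults list are never filled in: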

python main.py --multirun

With this flag, the experiments seem to run as expected.
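To see where the 12 runs come from, the grid that the two glob(*) entries expand to can be sketched in plain Python. This is only an illustration of the Cartesian product over the two config groups, not Hydra's own sweeper code. The option names are copied from the conf tree in the question, and the variable names are made up for the example.

```python
from itertools import product

# Option names copied from the conf/ tree in the question.
models = ["dummy", "naive_bayes"]
datasets = [
    "experiment_1_0",
    "experiment_2_0",
    "experiment_2_1",
    "experiment_3_0",
    "experiment_3_1",
    "experiment_3_2",
]

# One list of overrides per job: every model paired with every dataset.
jobs = [[f"model={m}", f"dataset={d}"] for m, d in product(models, datasets)]

# Print the override set each job in the sweep would receive.
for i, overrides in enumerate(jobs):
    print(f"#{i}: {' '.join(overrides)}")
```

Each printed line corresponds to one job of the sweep, i.e. one model/dataset pair that would otherwise have to be passed on the command line by hand.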

Demetri Pananos