In my config.yaml, how can I pass two datasets, e.g. cifar and cinic, at once? Can I pass multiple config groups to the defaults list?

This is for the case when I want to train my model on a mix of datasets, but I do not want to create a config group for every possible combination.

├── config.yaml
└── dataset
    ├── cifar.yaml
    ├── imagenet.yaml
    └── cinic.yaml

What I tried is as follows:

dataset: 
  - cifar
  - cinic

which resulted in the following error:

Could not load train_dataset/['cifar', 'cinic']. Available options 'cifar ... '

pseudo_teetotaler

1 Answer

Currently config groups are mutually exclusive. Support for this is planned for Hydra 1.1. See issue 499.

One possible workaround is to put everything in the config and to use interpolation:

all_datasets:
  imagenet:
    name: imagenet
  cifar10:
    name: cifar10

datasets:
  - ${all_datasets.imagenet}
  - ${all_datasets.cifar10}

This way you can override datasets with a different list of datasets from the command line (using interpolation).
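As a rough illustration of such a command-line override, the sketch below builds the override string from a list of dataset names. `build_datasets_override` is a hypothetical helper, not part of Hydra, and it assumes that Hydra's override grammar accepts a list of interpolations as a value; check this against the Hydra version you use.

```python
from typing import Iterable


def build_datasets_override(names: Iterable[str], key: str = "datasets",
                            source: str = "all_datasets") -> str:
    """Return an override such as datasets=[${all_datasets.cifar10}].

    Hypothetical helper: `key` and `source` mirror the config keys used in
    the example config above.
    """
    items = ",".join("${" + f"{source}.{name}" + "}" for name in names)
    return f"{key}=[{items}]"


if __name__ == "__main__":
    # Single-quote the result in a shell so that ${...} is not expanded
    # by the shell before Hydra sees it.
    print(build_datasets_override(["cifar10", "imagenet"]))
```

The printed string would then be passed as a single quoted argument after the script name.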

If you want to simplify the usage at the expense of some additional code, you can do something like:

all_datasets:
  ...

datasets_list:
  - imagenet
  - cifar10

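datasets: []

Then build the list in code:

import hydra
from omegaconf import DictConfig

@hydra.main(config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    # Resolve each selected name to its entry in all_datasets.
    for ds in cfg.datasets_list:
        cfg.datasets.append(cfg.all_datasets[ds])

if __name__ == "__main__":
    my_app()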

I didn't test this but I hope you get the idea.
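The selection logic itself can be checked without Hydra. The sketch below uses plain dicts as stand-ins for `cfg.all_datasets` and `cfg.datasets_list`; `select_datasets` is a hypothetical helper name, and the up-front check for unknown names is an addition, not something the workaround above requires.

```python
from typing import Any, Dict, List


def select_datasets(all_datasets: Dict[str, Dict[str, Any]],
                    names: List[str]) -> List[Dict[str, Any]]:
    """Return the entries for the requested dataset names, in order."""
    unknown = [n for n in names if n not in all_datasets]
    if unknown:
        # Fail early with the available options, similar in spirit to
        # the error Hydra reports for an unknown config group option.
        raise KeyError(
            f"Unknown datasets {unknown}; available: {sorted(all_datasets)}"
        )
    return [all_datasets[n] for n in names]


if __name__ == "__main__":
    # Plain-dict stand-ins for the config shown above (an assumption).
    all_datasets = {
        "imagenet": {"name": "imagenet"},
        "cifar10": {"name": "cifar10"},
    }
    print(select_datasets(all_datasets, ["imagenet", "cifar10"]))
```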

Omry Yadan
  • I made a suggestion on the GitHub issue that you opened. – Omry Yadan Sep 27 '20 at 01:39
  • Updated the answer with a couple of possible workarounds. Be creative. – Omry Yadan Sep 27 '20 at 20:14
  • Yeah, I did something similar: I provided a list of dataset names, and my code held the path and other properties for each dataset. However, can I use config group files with your method? If yes, which directory should they be in, e.g. `all_datasets/imagenet.yaml`? What I mean is that e.g. `imagenet` will have different properties like `path` etc. At present they are inside the dataset folder. – pseudo_teetotaler Oct 12 '20 at 16:23
  • What is proposed here is a workaround that avoids config groups, precisely because of the limitation of config groups in 1.0 (they are mutually exclusive). You can include files unconditionally, which might help: https://hydra.cc/docs/tutorials/basic/your_first_app/defaults#non-config-group-defaults – Omry Yadan Oct 12 '20 at 17:30
  • One more query @omry: there was a link to sample projects using Hydra on the hydra.cc webpage, but I am unable to find it. Has it been removed? :/ – pseudo_teetotaler Oct 15 '20 at 23:35
  • I don't know what you mean. There are example applications in the repo (and there are multiple links to them from the website). There are also the dependent repositories GitHub is tracking: https://github.com/facebookresearch/hydra/network/dependents?package_id=UGFja2FnZS01ODkxMjQ0ODE%3D – Omry Yadan Oct 17 '20 at 07:36