
Problem statement

I want the options supported by a Python module to be overridable with a .yaml file, because in some cases too many options need non-default values to specify them all on the command line.

I implemented the logic as follows.

import argparse
import sys

import yaml

parser = argparse.ArgumentParser()
# ... parser.add_argument() calls that come with default values ...
parser.add_argument("--config_path", default=None, type=str,
                    help="A yaml file for overriding parameters specification in this module.")
args = parser.parse_args()

# Override parameters
if args.config_path is not None:
    with open(args.config_path, "r") as f:
        yml_config = yaml.safe_load(f)
    for k, v in yml_config.items():
        if k in args.__dict__:
            args.__dict__[k] = v
        else:
            sys.stderr.write("Ignored unknown parameter {} in yaml.\n".format(k))

The problem is, for some options I have specific functions/lambda expressions to convert the input strings, such as:

parser.add_argument("--tokens", type=lambda x: x.split(","))

Adding an if statement per option, just to apply the corresponding function when reading the YAML file, does not seem like a good solution. Maintaining a separate dictionary that has to be updated whenever a new option is added to the parser seems redundant. Is there a way to get the type of each argument from the parser object?

  • You can also define your own actions and pass them to `add_argument` with the `action` parameter. These are more powerful than what you can do with `type`, so you might want to look into that. You can find and apply them in the same way to the values taken from the YAML file. – Anthon Oct 21 '18 at 11:40

1 Answer


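If the elements that you add to the parser with add_argument start with --, then they are actually optional and are usually called options. You can find them by walking over the result of the _get_optional_actions() method of the parser instance. Note that the leading underscore marks this method as private, so it is not part of the documented argparse API.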

If your config.yaml looks like:

tokens: a,b,c

then you can do:

import sys
import argparse
import ruamel.yaml


sys.argv[1:] = ['--config-path', 'config.yaml']  # simulate commandline

yaml = ruamel.yaml.YAML(typ='safe')

parser = argparse.ArgumentParser()
parser.add_argument("--config-path", default=None, type=str,
                    help="A yaml file for overriding parameters specification in this module.")
parser.add_argument("--tokens", type=lambda x: x.split(","))
args = parser.parse_args()


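def find_option_type(key, parser):
    """Return the converter registered for the option whose dest is `key`."""
    for opt in parser._get_optional_actions():
        # Match on dest so that "--config-path" is found under "config_path".
        if opt.dest == key:
            # Options added without type= have opt.type None; pass values through.
            return opt.type if opt.type is not None else (lambda value: value)
    raise ValueError("no option with dest {!r}".format(key))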

if args.config_path is not None:
    with open(args.config_path, "r") as f:
        yml_config = yaml.load(f)
    for k, v in yml_config.items():
        if k in args.__dict__:
            typ = find_option_type(k, parser)
            args.__dict__[k] = typ(v)
        else:
            sys.stderr.write("Ignored unknown parameter {} in yaml.\n".format(k))

print(args)

which gives:

Namespace(config_path='config.yaml', tokens=['a', 'b', 'c'])

Please note:

  • I am using the new API of ruamel.yaml. Loading this way is actually faster than using from ruamel import yaml; yaml.safe_load(), although your config files are probably not big enough for you to notice.
  • I am using the file extension .yaml, as recommended in the official FAQ. You should do so as well, unless you cannot (e.g. if your filesystem doesn't allow that).
  • I use a dash in the option string --config-path instead of an underscore. This is somewhat more natural to type, and argparse automatically converts it to an underscore so that the attribute name is a valid identifier.
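
The dash-to-underscore conversion mentioned in the last point can be checked with a minimal sketch like the following. It parses an explicit argument list rather than sys.argv, which is only a convenience for this illustration.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--config-path", default=None, type=str)

# Parse an explicit list instead of sys.argv so the example is reproducible.
args = parser.parse_args(["--config-path", "config.yaml"])

# The attribute name comes from the long option string, with "-" mapped to "_".
print(args.config_path)
```

This is also why the keys in the YAML file have to use underscores (config_path, not config-path) if they are compared against the attributes of args.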

You might want to consider a different approach: parse sys.argv by hand for --config-path, then set the defaults for all the options from the YAML config file, and only then call .parse_args(). Doing things in that order allows you to override, on the commandline, any value that the config file sets. If you do things your way, the config file always wins, so you have to edit it even if all of its values are correct except for one.
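
A rough sketch of that ordering, under a few assumptions: instead of scanning sys.argv by hand it uses a small pre-parser with parse_known_args() to pick out --config-path, the names build_parser, parse_with_config and fake_load are hypothetical, the --epochs option is made up for illustration, and the loader is passed in as a callable so the example runs without a YAML file (in real use it would open the file and call the YAML loader).

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-path", default=None, type=str)
    parser.add_argument("--tokens", type=lambda x: x.split(","))
    parser.add_argument("--epochs", type=int, default=1)  # hypothetical extra option
    return parser


def parse_with_config(argv, load_config):
    # Step 1: pick out --config-path only; all other arguments are left alone.
    pre = argparse.ArgumentParser(add_help=False)
    pre.add_argument("--config-path", default=None)
    known, _ = pre.parse_known_args(argv)

    parser = build_parser()
    if known.config_path is not None:
        config = load_config(known.config_path)
        # Step 2: only keys matching a known dest become defaults.
        dests = {opt.dest for opt in parser._get_optional_actions()}
        parser.set_defaults(**{k: v for k, v in config.items() if k in dests})

    # Step 3: explicit command-line values override the config defaults.
    return parser.parse_args(argv)


# Stand-in for reading a YAML file (assumption: the real loader returns a dict).
def fake_load(path):
    return {"tokens": "a,b,c", "epochs": 5}


print(parse_with_config(["--config-path", "config.yaml"], fake_load))
print(parse_with_config(["--config-path", "config.yaml", "--epochs", "7"], fake_load))
```

A side effect of this ordering is that argparse runs a default through the option's type function when that default is a string, so a YAML value such as a,b,c for tokens is split without any extra lookup. Defaults that are not strings, such as an integer loaded by the YAML parser, are used as they are.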

Anthon
  • Thanks for your detailed reply. BTW, I'll correct the terminologies in my post accordingly. – Boris Polonsky Oct 21 '18 at 13:24
  • But `.parse_args()` does not convert/apply the type argument, correct? So the args will then just be what yaml converts them to? This may be even more problematic when argparse parses via `append` or `extend`, right? Is there a solution to this? – Thomas Hilger Nov 22 '21 at 13:58
  • @ThomasHilger If you have a question, please post it as such. Your comment doesn't really make sense, and a (non-)working example would help. Do you think there is a problem with yaml loading something as an integer, and that value not being converted to the type the parser expects, even though it is cast explicitly with `typ(v)`? – Anthon Nov 22 '21 at 17:10