
I have a set of parameters for training and a set of parameters for tuning. They share the same names but have different default values. I'd like to use argparse both to choose which group of default values to use and to parse the values.

I have learned this is possible by using add_subparsers to set up a subparser for each mode. However, their names are identical which means I'll have to set the same parameters twice (which is very long).

I also tried using two parsers: the first one parses a few args to determine which group of default values to use, and then parser.set_defaults(**defaults) sets those defaults on the second parser, like this:

train_defaults = dict(
    optimizer='AdamW',
    lr=1e-3,
    strategy='linear',
    warmup_steps=5_000,
    weight_decay=0.3
)


tune_defaults = dict(
    optimizer='SGD',
    lr=1e-2,
    strategy='cosine',
    warmup_steps=500,
    weight_decay=0.0
)

selector = argparse.ArgumentParser(description='Mode Selector')
mode = selector.add_mutually_exclusive_group()
mode.add_argument('-t', '--train', action='store_true', help='train model')
mode.add_argument('-u', '--tune', action='store_true', help='tune model')
select, unknown = selector.parse_known_args()
defaults = tune_defaults if select.tune else train_defaults
parser.set_defaults(**defaults)
args, unknown = parser.parse_known_args()

But the two parsers conflict on some args. For example, -td refers to --train_data in parser, but selector also tries to parse it (as -t with an attached value 'd'), which raises an error:

usage: run.py [-h] [-pt | -pa] [-t] [-u] [-v]
run.py: error: argument -t/--train: ignored explicit argument 'd'

(This is an MWE; the actual args may vary.)

  • " However, their names are identical which means I'll have to set the same parameters twice (which is very long)." Just extract the identical parameters to another function. – SyntaxRules Jan 15 '21 at 20:21
  • This seems like a case for `action='store_const'`. `--train` and `--tune` each set the same destination to one of the two `dict`s. – chepner Jan 15 '21 at 20:24
  • Is this it? Or is there more to this parser? Like definition of more arguments and a new `parse_args`? I don't understand your last error description. – hpaulj Jan 15 '21 at 21:17
  • There are around 40 args. The above code is an example with two parse_args calls, but the mode selector (--train) has some conflicts with parser (--train_dir, --train_xxx). This won't happen if there is only one parser – Zhiyuan Chen Jan 16 '21 at 04:31
  • Are you willing to try a non-argparse solution, e.g. reading environment variables? – SethMMorton Jan 17 '21 at 00:50
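
The `store_const` idea from the comments could look roughly like the sketch below. It is only one possible reading of that suggestion: the destination name `mode_defaults`, the `parse` helper, the handful of options shown, and the choice of training as the fallback mode are all assumptions for illustration. Because everything lives in a single parser, `-td` is matched as its own option rather than as `-t` plus `d`.

```python
import argparse

train_defaults = dict(optimizer='AdamW', lr=1e-3, strategy='linear',
                      warmup_steps=5_000, weight_decay=0.3)
tune_defaults = dict(optimizer='SGD', lr=1e-2, strategy='cosine',
                     warmup_steps=500, weight_decay=0.0)


def parse(argv=None):
    # SUPPRESS keeps options that were not typed out of the namespace,
    # so they cannot overwrite the per-mode defaults merged in below.
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    mode = parser.add_mutually_exclusive_group()
    # Both flags write to the same (hypothetical) destination.
    mode.add_argument('-t', '--train', dest='mode_defaults',
                      action='store_const', const=train_defaults)
    mode.add_argument('-u', '--tune', dest='mode_defaults',
                      action='store_const', const=tune_defaults)
    parser.add_argument('-td', '--train_data')
    parser.add_argument('--optimizer')
    parser.add_argument('--lr', type=float)
    parser.add_argument('--strategy')
    parser.add_argument('--warmup_steps', type=int)
    parser.add_argument('--weight_decay', type=float)

    given = vars(parser.parse_args(argv))
    # Assumption: training is the mode when neither flag is passed.
    defaults = given.pop('mode_defaults', train_defaults)
    # Explicitly typed values win over the selected defaults.
    return argparse.Namespace(**{**defaults, **given})
```

Calling `parse(['-u', '--lr', '0.5'])` would then start from the tuning defaults and override only `lr`.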

2 Answers

1

The multiple parsers solution, as you are finding, can be error-prone. I see two alternatives:

Use environment variables

Something like this:

import os

do_tuning = os.getenv("DO_THE_TUNING_MODE", None) is not None

...

defaults = tune_defaults if do_tuning else train_defaults
parser = argparse.ArgumentParser()
...
parser.set_defaults(**defaults)
args, unknown = parser.parse_known_args()

Use like

DO_THE_TUNING_MODE=1 run.py <options>

or

export DO_THE_TUNING_MODE=1
run.py <options>

(or of course, don't set for training mode)

  • Pros:
    • Tuning/selection method is outside the parser so you don't get conflicts
    • A user can set a "state" in their shell session to tuning or training and not have to continuously set the option when running
  • Cons:
    • An environment variable is less straightforward to set than calling a command-line option for one-time use
    • It is easy to forget what your environment variable is set to

Use subparsers

This is probably the best solution. You indicated that you did not want to do that because you have so many options, but that's what functions are for.

def add_parsing_options(parser):
    # All your 40 options go here
    parser.add_argument(...)
    parser.add_argument(...)


parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
tuning_parser = subparsers.add_parser("tune")
training_parser = subparsers.add_parser("train")
add_parsing_options(tuning_parser)
add_parsing_options(training_parser)
tuning_parser.set_defaults(**tune_defaults)
training_parser.set_defaults(**train_defaults)
args, unknown = parser.parse_known_args()

Call like

run.py train <options>

or

run.py tune <options>
  • Pros:
    • It is explicit when using the tool which mode is being used
  • Cons:
    • It is an extra parameter to type every time the tool is used
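
For reference, here is a small runnable version of the subparser approach. It is a sketch, not the full 40-option parser: the three options, the trimmed default dicts, the helper names `add_shared_options` and `build_parser`, and the `dest='mode'` attribute are placeholders chosen for illustration.

```python
import argparse

# Trimmed-down defaults; the real dicts would hold every option.
train_defaults = dict(optimizer='AdamW', lr=1e-3)
tune_defaults = dict(optimizer='SGD', lr=1e-2)


def add_shared_options(parser):
    # Declared once, attached to every subparser.
    parser.add_argument('-td', '--train_data')
    parser.add_argument('--optimizer')
    parser.add_argument('--lr', type=float)


def build_parser():
    parser = argparse.ArgumentParser()
    # dest='mode' records which subcommand was chosen.
    subparsers = parser.add_subparsers(dest='mode', required=True)
    for name, defaults in (('train', train_defaults),
                           ('tune', tune_defaults)):
        sub = subparsers.add_parser(name)
        add_shared_options(sub)
        # Overrides the defaults of the options added just above.
        sub.set_defaults(**defaults)
    return parser
```

With this, `build_parser().parse_args(['tune', '--lr', '0.5'])` should pick up the tuning defaults and override only `lr`, and `-td` no longer collides with anything because the mode is a positional subcommand rather than a `-t` flag.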
SethMMorton
0

I partially resolved this with some hard-coding. Since the first parser is only used to set the default parameters of the second parser, it takes only a few arguments (2 in my case). So I split sys.argv into two parts:

import sys
select, unknown = selector.parse_known_args(sys.argv[:3])
args, unknown = parser.parse_known_args(sys.argv[3:])

Pros:

  • has most, if not all, of the pros of the other methods
  • no extra parameter to type every time

Cons:

  • the split index 3 is hard-coded and has to be kept in sync with the selector's arguments by hand
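
One way to avoid a fixed split index might be to pick the mode flags out of the argument list by exact token match and hand everything else to the main parser. This is a different technique from slicing, shown only as a sketch: `MODE_FLAGS` and `split_argv` are hypothetical names, and it assumes no other option ever takes a value that is literally `-t` or `-u`.

```python
import argparse

# Hypothetical set of tokens that belong to the mode selector.
MODE_FLAGS = {'-t', '--train', '-u', '--tune'}


def split_argv(argv):
    """Separate whole-token mode flags from everything else."""
    mode_args = [a for a in argv if a in MODE_FLAGS]
    rest = [a for a in argv if a not in MODE_FLAGS]
    return mode_args, rest


selector = argparse.ArgumentParser(add_help=False)
mode = selector.add_mutually_exclusive_group()
mode.add_argument('-t', '--train', action='store_true')
mode.add_argument('-u', '--tune', action='store_true')

# In a real script the list would come from sys.argv[1:].
mode_args, rest = split_argv(['-u', '-td', 'data.csv', '--lr', '0.5'])
select = selector.parse_args(mode_args)
# `rest` is what the second parser would receive; '-td' never
# reaches the selector, so it cannot be read as '-t' plus 'd'.
```

The mode flags can then appear anywhere on the command line, not only in the first two positions.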