
I'm writing a tool which passes arguments to another command provided with the arguments like this:

foo --arg1 -b -c bar -v --number=42

In this example foo is my tool and --arg1 -b -c should be the arguments parsed by foo, while -v --number=42 are the arguments going to bar, which is the command called by foo.

So this is quite similar to strace where you can provide arguments to strace while still providing a command with custom arguments.

argparse provides parse_known_args(), but it will parse every argument it knows, even those coming after bar.
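To make the problem concrete, here is a minimal sketch. The parser and its `-b`, `-c`, and `-v` flags are hypothetical stand-ins for foo's real options, not part of the actual tool:

```python
import argparse

# Hypothetical stand-in for foo's parser; the flags are illustrative only.
parser = argparse.ArgumentParser(prog="foo")
parser.add_argument("-b", action="store_true")
parser.add_argument("-c", action="store_true")
parser.add_argument("-v", action="store_true")

# -v is meant for bar, but the parser claims it because it knows the flag.
known, extras = parser.parse_known_args(["-b", "-c", "bar", "-v", "--number=42"])
print(known.v)   # True, although -v only appears after bar
print(extras)    # the leftovers for bar no longer contain -v
```

The `-v` intended for bar is silently consumed by foo's parser, so it can no longer be passed on.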

Some tools use a special syntax element (e.g. --) to separate arguments with different semantics, but since I know foo will only process named arguments, I'd like to avoid this.

You might think it can't be too hard to find the first positional argument manually, and this is what I'm currently doing:

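import sys
from argparse import ArgumentParser
from contextlib import suppress

def parse_args(args=None):
    parser = ArgumentParser()
    parser.add_argument("--verbose", "-v", action="store_true")
    all_args = args or sys.argv[1:]
    with suppress(StopIteration):
        # Index of the first argument that does not look like an option
        split_at = next(i for i, e in enumerate(all_args) if not e.startswith("-"))
        return parser.parse_args(all_args[:split_at]), all_args[split_at:]
    raise RuntimeError("No command provided")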

And this works with the example I've provided. But with argparse you can define options that take a value, and the value can, but does not have to, be attached with a =:

foo --file=data1 --file data2 bar -v --number=42

So here it would be much harder to manually identify data2 to be a value for the second --file argument, and bar to be the first positional argument.
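A small sketch of why the naive rule breaks on this input. The helper name `split_at_first_non_option` is made up for illustration and only mirrors the "first token without a leading dash" logic:

```python
def split_at_first_non_option(argv):
    """Naive split: everything before the first token without a leading dash."""
    for i, token in enumerate(argv):
        if not token.startswith("-"):
            return argv[:i], argv[i:]
    raise RuntimeError("No command provided")


argv = ["--file=data1", "--file", "data2", "bar", "-v", "--number=42"]
own, command = split_at_first_non_option(argv)
print(own)      # ['--file=data1', '--file']
print(command)  # ['data2', 'bar', '-v', '--number=42']
```

The split lands on data2 instead of bar, leaving a dangling --file without its value on the left-hand side.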

My current approach is to split the arguments manually, scanning backwards, and check whether all 'left hand' arguments parse successfully:

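import sys
from argparse import ArgumentParser
from contextlib import suppress
from typing import NoReturn

def parse_args():
    class MyArgumentParser(ArgumentParser):
        def error(self, message: str) -> NoReturn:
            # Raise instead of exiting so the caller can try another split point
            raise RuntimeError(message)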

    parser = MyArgumentParser()
    parser.add_argument("--verbose", "-v", action="store_true")
    parser.add_argument("--with-value", type=str)

    all_args = sys.argv[1:]
    for split_at in (i for i, e in reversed(list(enumerate(all_args))) if not e.startswith("-")):
        with suppress(RuntimeError):
            return parser.parse_args(all_args[:split_at]), all_args[split_at:]

    parser.print_help(sys.stderr)
    print("No command provided", file=sys.stderr)
    raise SystemExit(-1)

That works for me, but apart from the clumsy extra MyArgumentParser, which is needed only so I can handle parser errors myself, I now have to classify mistakes manually, because a parser error has become something that occurs naturally while searching for the split point.

So is there a way to tell argparse to parse only until the first positional argument and then stop even if there are arguments it knows after that one?

frans
  • What exactly is the issue with `--`? That nomenclature seems easier for all parties involved. – flakes Feb 27 '23 at 06:16
  • You've discovered the only tricks - `--` and `parse_known_args`. There is another one, `nargs='...'` (REMAINDER), but that's now undocumented since it's a bit buggy. – hpaulj Feb 27 '23 at 06:33
  • "So here it would be much harder to manually identify data2 to be a value for the second --file argument, and bar to be the first positional argument." - Yes, that is the exact problem I see here, and I think it is a design issue. **What rule do you want the code to use**, in order to decide that `data2` is a value for the second `--file` argument, while `bar` is a separate thing? Should it be based on `bar` being in a list of known sub-commands? Should it be based on expecting `--file` to use exactly one subsequent value? Something else? – Karl Knechtel Feb 27 '23 at 06:43
  • (To be honest, `argparse` is definitely not my favourite standard library module, and it is on my short list of things to redesign/reimplement as part of my critique of the standard library.) – Karl Knechtel Feb 27 '23 at 06:45
  • @hpaulj: there are two problems with `--`. First, it's plainly superfluous. From the arguments I can already see where the extra command starts; it's just a shortcoming of `argparse` I want to work around. Second, it might also be an argument to `bar`, so I would still have to deal with a set of arguments which is not fully "mine" – frans Feb 27 '23 at 06:52
  • `subparsers` is a positional argument. Once the parsing is passed to the specified subparser, the main one does not resume parsing. Notice that subparsers are handled by a custom action class. I can imagine writing an Action subclass that acts like a stripped-down subparsers. `subparsers` has a special `nargs='A...'` (`argparse.PARSER`, undocumented). – hpaulj Feb 27 '23 at 08:23
  • With a subparser I'd have to introduce a special element as well (like e.g. `foo -v --file out.log run bar -v --some arg`). That's an option indeed but I'd like to do this only if there is a proved reason why I can't avoid it. – frans Feb 27 '23 at 14:34
  • @KarlKnechtel: in my example `bar` is not just some positional argument to `foo` but it's the name of another executable. Like with `strace`: `strace --arg-for strace ./some_executable --arg-for another-executable` – frans Feb 27 '23 at 14:39
  • `argparse` does not "know" what various strings represent to you or the following code. They are just strings. At a top level it classifies the strings as `options` or `arguments` (based on the dash), and then it alternates processing "positionals" and "optionals", handing out strings based on their `nargs`. There's no short-circuiting that iteration except by errors, help-style exit, and subparsers. In contrast in `optparse`, each option is given all the remaining strings, and only returns the ones it doesn't use. – hpaulj Feb 27 '23 at 15:58

1 Answer

Define a parser:

In [3]: parser = argparse.ArgumentParser()
   ...: parser.add_argument('--file', action='append')
   ...: parser.add_argument('-f')
   ...: parser.add_argument('rest', nargs=argparse.PARSER);

The only special thing is this 'rest' positional with an undocumented nargs (the same one used by subparsers). It's like '+', but the values other than the first can look like flags.

In [4]: parser.print_help()
usage: ipykernel_launcher.py [-h] [--file FILE] [-f F] rest ...

positional arguments:
  rest

options:
  -h, --help   show this help message and exit
  --file FILE
  -f F

And testing on one of your inputs (I didn't account for the foo argument):

In [5]: args = parser.parse_args('--file=data1 --file data2 -f1 bar -v --number=42 --file=data3'.split())

In [6]: args
Out[6]: Namespace(file=['data1', 'data2'], f='1', rest=['bar', '-v', '--number=42', '--file=data3'])

Omit the positional, and we get an error:

In [7]: args = parser.parse_args('--file=data1 --file data2 -f1 -v --number=42 --file=data3'.split())
usage: ipykernel_launcher.py [-h] [--file FILE] [-f F] rest ...
ipykernel_launcher.py: error: the following arguments are required: rest
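For use outside an interactive session, the same idea might be wrapped in a function. This is only a sketch: `split_command_line`, the `command` destination, and the counting `-v` option are names chosen here for illustration, and `argparse.PARSER` is undocumented, so its behaviour could change between Python versions.

```python
import argparse


def split_command_line(argv):
    """Parse the tool's own options and return (namespace, wrapped command)."""
    parser = argparse.ArgumentParser(prog="foo")
    parser.add_argument("--file", action="append")
    # Counting -v makes it visible how many -v flags this parser consumed.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    # Undocumented nargs: first value must not look like a flag, the rest may.
    parser.add_argument("command", nargs=argparse.PARSER)
    ns = parser.parse_args(argv)
    return ns, ns.command


ns, command = split_command_line(
    ["--file=data1", "--file", "data2", "-v", "bar", "-v", "--number=42"]
)
print(ns.file)     # ['data1', 'data2']
print(ns.verbose)  # 1
print(command)     # ['bar', '-v', '--number=42']
```

In this run, the `-v` after bar stays in `command` and is not counted by the outer parser, matching what happened to `--file=data3` in the session above.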
hpaulj
  • That does not fit my use case, which might have been unclear, so I've explained it a bit more yesterday: the tool I'm writing takes a full arbitrary shell command, which can have _any_ arguments, notably also arguments my tool understands. Thus the comparison to `strace`. `strace -v python -v` is a valid command with the first `-v` going to `strace` and the second to `python`. The approach with a non-optional positional argument would be unable to parse unknown args, and even `parse_known_args()` would cherry-pick items it knows. – frans Feb 28 '23 at 08:20
  • What I showed was a way to define a `positional` that consumes all the following strings. I did not change processing order. `argparse` parses the `argv` in order (data driven). It does not skip strings, look ahead, or scan from the end. Unrecognized flags are put in the `unknown` category. I tried to show how it handles repeated flags. It looks like you need to work with `sys.argv` directly, possibly with repeated calls to a parser. Your processing logic does not fit with `argparse` (and probably not any other third party parser). – hpaulj Feb 28 '23 at 16:15
  • Yeah, I guess this is what I'm currently doing as seen in my "current approach" in the question – frans Feb 28 '23 at 16:29