Why did Python `argparse` stop documenting nargs=REMAINDER?

Question

In the documentation for Python's argparse module, the 3.8 documentation states that nargs may be set to:

argparse.REMAINDER. All the remaining command-line arguments are gathered into a list. This is commonly useful for command line utilities that dispatch to other command line utilities.

This has been removed from the 3.9 documentation, though there is no mention of it being deprecated, nor any good reason to do so that I can see, given that it provides useful functionality not apparently provided by other means.¹ Its existence is still mentioned in passing elsewhere in the page:

These [intermixed] parsers do not support all the argparse features, and will raise exceptions if unsupported features are used. In particular, subparsers, argparse.REMAINDER, and mutually exclusive groups that include both optionals and positionals are not supported.

But even this is removed in the [3.10 documentation]. Yet the feature persists even in Python 3.11.4, the latest released version.

So why has it been removed from the documentation?

I ask this because the answer seems likely to bear directly on several other related questions I have about programming argument parsers in Python. (The particular situations where I was, am, and may continue to use nargs=REMAINDER are large enough that I believe that they should be posted as separate questions, if necessary.) The considerations include:

Is the API broken in some way for my purposes, and does this imply that my code that uses it is also broken?
Should I look for a replacement for this API?
Should I continue to use this API in new code? After all, it hasn't been deprecated.
Should I be converting existing code that uses this API to use something else?

(Note too that the answers to questions like this will not only depend on the particular context in which nargs=REMAINDER is used, but may also be considered matters of opinion, which is another reason to leave them beyond the scope of this question.)

¹ nargs=REMAINDER is different from nargs='*': using REMAINDER means that argparse will not attempt to parse options (starting with -) from that point on. Thus, with REMAINDER, mycmd -q run bash -c exit will not attempt to parse the -c as an option to mycmd; instead the line is treated the same way that mycmd -q run -- bash -c exit is treated with '*'.
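A minimal sketch of the difference described in this footnote. The parser layout (`-q`, `cmd`, `args`) and the `build_parser` helper are hypothetical, chosen only to match the `mycmd -q run bash -c exit` example; this is an illustration, not code taken from the documentation.

```python
import argparse


def build_parser(nargs):
    # Hypothetical layout for the "mycmd" example in the footnote.
    parser = argparse.ArgumentParser(prog="mycmd")
    parser.add_argument("-q", action="store_true")
    parser.add_argument("cmd")
    parser.add_argument("args", nargs=nargs)
    return parser


argv = ["-q", "run", "bash", "-c", "exit"]

# REMAINDER: everything after "run" is collected verbatim, including "-c".
remainder_ns = build_parser(argparse.REMAINDER).parse_args(argv)
print(remainder_ns.args)

# "*": "-c" is looked up as an option of mycmd, which does not define it,
# so parse_args() reports an error and exits.
star_parser = build_parser("*")
try:
    star_parser.parse_args(argv)
    star_failed = False
except SystemExit:
    star_failed = True
print(star_failed)

# "*" plus an explicit "--" typed by the user gives the REMAINDER-like result.
star_ns = star_parser.parse_args(["-q", "run", "--", "bash", "-c", "exit"])
print(star_ns.args)
```

The point of REMAINDER in this layout is that the user does not have to type the `--` themselves.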

[3.10 documentation]

I’m voting to close this question because it is a question about a decision, which can only properly be answered by the person or people who made that decision. Any reasoning anyone else could offer would be purely speculative. The most appropriate place to ask about the Python documentation - to question a decision or suggest an improvement - is https://discuss.python.org/c/documentation/26. — Karl Knechtel, Aug 22 '23 at 18:59
Removing it from the documentation has been discussed in the GitHub issues (open or closed). I don't remember commenting on it, but my memory is that its implementation was buggy enough that the devs thought it simplest to just un-document it, while still leaving it in the code. Therefore, use it at your own risk, and don't come complaining if it does not work as you want. — hpaulj, Aug 22 '23 at 19:04
I wrote that `intermixed` parser and comment, quite some time ago. As noted, I made no effort to test this combination of features. Also, you have found the (rough) double dash equivalent. — hpaulj, Aug 22 '23 at 19:09
As a matter of curiosity, there's another undocumented `nargs`, `'A...'`, which expects at least one (non-dash) string. That's used by the `subparsers`, `argparse.PARSER`. — hpaulj, Aug 22 '23 at 19:11
https://www.reddit.com/r/learnpython/comments/trhpow/undocumented_features_eg_argparseremainder/ — philipxy, Aug 28 '23 at 02:30

Nick ODell · Answer 1 · 2023-08-28T15:29:26.450

4

You can see the discussion about why this was removed here.

Essentially, it was removed from the documentation because it was very fragile. The original author of the feature wrote a documentation example, but very small changes to that example (like putting the REMAINDER argument first) break it. Removing it from documentation steers new users away from it, without the backward compatibility problems of fully removing it. (Note: this characterization is disputed; see comments for why.)

Instead of getting the remaining arguments, an alternative approach is to get unrecognized arguments using parser.parse_known_args(). Example. This approach is still documented.
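As a rough illustration of what `parse_known_args()` returns (the `--verbose` and `--other` option names are made up for this sketch):

```python
import argparse

parser = argparse.ArgumentParser(prog="mycmd")
parser.add_argument("--verbose", action="store_true")

# parse_known_args() returns a (namespace, extras) pair instead of exiting
# with an error when it meets strings it does not recognise.
known, extras = parser.parse_known_args(["--verbose", "--other", "x"])
print(known.verbose, extras)
```

Note that this collects whatever the parser did not recognise, wherever it appears on the command line; it does not mark a position at which parsing stops.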

See here and here for the PRs which removed the documentation.

edited Aug 28 '23 at 15:29

answered Aug 22 '23 at 19:07

Nick ODell


Thanks for the references; those are good. However, there are two issues with this answer that keep me from upvoting it. 1. I don't accept your conclusion that being unable to put a `REMAINDER` argument first makes it "very fragile." (Nor is this the consensus of the thread you linked.) This is probably not the place for further discussion of your and my opinions on this, however. – cjs Aug 23 '23 at 03:55
2. You can _not_ replace `REMAINDER` with `parse_known_args()` since only `REMAINDER` can be used to specify a position at which to stop consuming options. `parse_known_args()` will consume options even if they were clearly intended for the other command, if it has an option with the same name. (Or even a similar name, if you didn't explicitly disallow abbreviations.) I've [demonstrated this here](https://github.com/0cjs/example-code/tree/main/py-argparse-remainder). – cjs Aug 23 '23 at 03:56
@cjs I don't necessarily disagree; I'm just trying to summarize 13 pages of discussion, while providing alternatives that *are* documented. If you don't agree with the discussion, then you can continue to use the feature. If you don't agree with the decision to remove the docs, you can ask on the python bug tracker to have it reinstated. But I'm not a Python dev, and I have no power to help you. – Nick ODell Aug 23 '23 at 16:10

I'm an argparse dev, though not nearly as active now as I once was. The module is old enough that development is slow, and favors maintaining backward compatibility, rather than adding new features. It is easier to tweak the documentation to match the code than the other way around. While it is the 'official' parser, there are good third party alternatives. Also users are welcome to implement their own changes; the code is good OOP, so changes and enhancements are easily added with subclassing. – hpaulj Aug 24 '23 at 01:17
@NickODell To be clear, the issues are: 1. I don't think your summary is a fair one; I think it would be more accurate to say something like "whether it works 'properly' and is useful is disputed." 2. What you provided is an alternative that works differently, and is a substitute in some cases but not others, but your text pretty clearly claims that `REMAINDER` duplicates `parse_known_args()`, which it doesn't. These are the things that are keeping me from upvoting and accepting your answer; the links (including the `parse_known_args()` suggestion itself) are exactly what I was looking for. – cjs Aug 24 '23 at 06:30

score 1 · Answer 2 · answered Aug 29 '23 at 06:39

Feel free to skip the "Summary" section at the bottom if you don't want to read all the details.

This is based on the mostly excellent links provided in Nick ODell's answer, particular comments in those links, and other comments provided here (particularly by hpaulj; hat tip to him).

The Problem

A key design decision in argparse is that it parses options anywhere in the command line, i.e., ls somedir/ -l is just as valid as ls -l somedir/. (Note that this differs from getopt, where the non-option arguments would be ['somedir/', '-l'] in the first example above.)
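A small sketch of that difference, using the `ls` example above. The parser definition (a `-l` flag and a single `path` positional) is an assumption made for illustration.

```python
import argparse
import getopt

argv = ["somedir/", "-l"]

# argparse accepts the option after the positional argument.
parser = argparse.ArgumentParser(prog="ls")
parser.add_argument("-l", action="store_true")
parser.add_argument("path")
ns = parser.parse_args(argv)
print(ns.l, ns.path)

# getopt.getopt() stops at the first non-option, so "-l" is not parsed as
# an option here and comes back as part of the remaining arguments.
opts, rest = getopt.getopt(argv, "l")
print(opts, rest)
```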

The end user can apply parsing to only a prefix of a command line by using the special -- "option," which changes the part of the command line that's passed to the parser before the parser runs, but clearly developers wanting to use argparse.REMAINDER are wanting to avoid making the user do this.

Unfortunately, it seems that nargs=REMAINDER doesn't fit well into this "parse options anywhere in the command line" system. I am not aware of the exact reasons for this (some of which may be inherent in the logic of what argparse is trying to do, as opposed to just a matter of doing the programming); I think that would be a suitable topic for another question.

This started with a bug report "argparse.REMAINDER doesn't work as first argument" (since moved to GitHub), which could also be described as a situation where the only argument uses nargs=REMAINDER. This brings up a relatively subtle problem for which there's much discussion, including how nargs=REMAINDER should work in sub-parsers, which itself opens up a whole new can of worms.

paulj3's comment goes into much of the detail about the difficulties of determining REMAINDER behaviour, determining what its behaviour should be, and documenting all of this; responses point out that he may not be correct about certain details. It also points out that the original case doesn't appear to need or want argparse:

If you don't want such a gatekeeper [by which he means functionality like -- —cjs], why [use] argparse at all? Why not use sys.argv[1:] directly?

The Removal

In the comment above, paulj3 makes a key point:

So some sort of warning about the limitations of REMAINDER would be good. But the trick is to come up with something that is clear but succinct. The argparse documentation is already daunting to beginners.

This is generally agreed to be difficult, and nobody seems actually to have proposed specific changes to the documentation, so paulj3 says in a later comment:

Since this feature is buggy, and there isn't an easy fix, we should probably remove any mention of it from the docs. We can still leave it as an undocumented legacy feature.

There is precedent for leaving nargs constants undocumented. argparse.PARSER ('+...') is used by the subparser mechanism, but is not documented. https://bugs.python.org/issue16988

This seemed to attract enough agreement that PRs were submitted and all mention of REMAINDER was removed.

The Alternatives

(Note: this is not technically related to the original question here, but I bring this up mainly because of the common parse_known_args() confusion that arises in discussions of nargs=REMAINDER.)

Nick ODell's answer suggests, based on answers to "python argparse: unrecognized arguments", that using Parser.parse_known_args() instead of Parser.parse_args() is a replacement for nargs=REMAINDER. Some answers to the question "Getting the remaining arguments in argparse" do the same.

These are not correct. The "unrecognised arguments" question is even dealing with an entirely different problem that nargs=REMAINDER will not solve. (The actual solution was not to pass sys.argv to parse_args().) And though there are two mentions of parse_known_args() in the original bug report discussion, those are simply related to demonstrating argparse behaviour, and it is never suggested as an alternative.
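To make the difference concrete, here is a sketch under assumed names: a wrapper `mycmd` that has its own `-v` flag and dispatches to a child command that also accepts `-v`. Neither parser comes from the linked discussions.

```python
import argparse

argv = ["run", "child", "-v"]  # here "-v" is meant for the child command

# Variant 1: REMAINDER marks the position where mycmd stops parsing options.
p1 = argparse.ArgumentParser(prog="mycmd")
p1.add_argument("-v", action="store_true")
p1.add_argument("cmd")
p1.add_argument("rest", nargs=argparse.REMAINDER)
ns1 = p1.parse_args(argv)
print(ns1.v, ns1.rest)

# Variant 2: parse_known_args() keeps parsing options everywhere, so the
# child's "-v" is consumed as mycmd's own "-v" and never reaches the child.
p2 = argparse.ArgumentParser(prog="mycmd")
p2.add_argument("-v", action="store_true")
p2.add_argument("cmd")
ns2, extras = p2.parse_known_args(argv)
print(ns2.v, extras)
```

In the second variant the wrapper silently changes its own behaviour and passes an incomplete command line to the child, which is the case where `parse_known_args()` is not a substitute.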

As for what to do in your particular program, that's heavily dependent on the program itself and your aims, and would require a separate question giving a lot more detail about the particular situation. As far as I can tell, this is just a documentation change, and nargs=REMAINDER appears not to be going away, so you could continue to use that. This is discussed in more detail in several answers to "Getting the remaining arguments in argparse" (note again that some of those answers incorrectly suggest parse_known_args()). This answer gives code for several solutions to that particular question, and a good warning:

The best all-around answer so far is nargs=REMAINDER, but it might really depend on your use case.

Other alternatives include getopt (which always stops at the first non-option, avoiding the problem via different and less featureful parsing logic) and, as suggested above, simply using sys.argv[1:] if that fits your situation.
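A minimal sketch of the getopt route, assuming a wrapper whose only option is a hypothetical `-q` flag; `split_command_line` is an illustrative helper, not part of any library.

```python
import getopt


def split_command_line(argv):
    """Parse the wrapper's own options and return the child command untouched."""
    # getopt.getopt() stops at the first non-option argument ("run" below),
    # so options that follow it are left for the child command.
    opts, rest = getopt.getopt(argv, "q")
    quiet = ("-q", "") in opts
    return quiet, rest


quiet, child_argv = split_command_line(["-q", "run", "bash", "-c", "exit"])
print(quiet, child_argv)
```

The trade-off is the one noted above: you give up argparse's help generation, type conversion, and sub-parsers in exchange for simple stop-at-first-positional behaviour.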

Summary

nargs=REMAINDER is very hard to document, and may not even fit well into the design of argparse, causing complex edge cases. So it appears that to avoid further complexity in already complex documentation, for an awkward feature that doesn't fit well with the overall philosophy of how argparse parses arguments, it was felt best simply to remove it from the documentation.

(Whether this was the best way to go is of course a matter of opinion, but to me, "don't make the documentation more complex" seems a valid argument to make.)
