63

Is there a way to have Python static analyzers (e.g. in PyCharm, other IDEs) pick up on Typehints on argparse.Namespace objects? Example:

parser = argparse.ArgumentParser()
parser.add_argument('--somearg')
parsed = parser.parse_args(['--somearg','someval'])  # type: argparse.Namespace
the_arg = parsed.somearg  # <- Pycharm complains that parsed object has no attribute 'somearg'

If I remove the type declaration in the inline comment, PyCharm doesn't complain, but it also doesn't pick up on invalid attributes. For example:

parser = argparse.ArgumentParser()
parser.add_argument('--somearg')
parsed = parser.parse_args(['--somearg','someval'])  # no typehint
the_arg = parsed.somaerg   # <- typo in attribute, but no complaint in PyCharm.  Raises AttributeError when executed.

Any ideas?


Update

Inspired by Austin's answer below, the simplest solution I could find is one using namedtuples:

from collections import namedtuple
ArgNamespace = namedtuple('ArgNamespace', ['some_arg', 'another_arg'])

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')
parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: ArgNamespace

x = parsed.some_arg  # good...
y = parsed.another_arg  # still good...
z = parsed.aint_no_arg  # Flagged by PyCharm!

While this is satisfactory, I still don't like having to repeat the argument names. If the argument list grows considerably, it will be tedious updating both locations. What would be ideal is somehow extracting the arguments from the parser object like the following:

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')
MagicNamespace = parser.magically_extract_namespace()
parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: MagicNamespace

I haven't been able to find anything in the argparse module that could make this possible, and I'm still unsure if any static analysis tool could be clever enough to get those values and not bring the IDE to a grinding halt.

Still searching...


Update 2

Per hpaulj's comment, the closest thing I could find to the method described above that would "magically" extract the attributes of the parsed object is something that extracts the dest attribute from each of the parser's _actions:

parser = argparse.ArgumentParser()
parser.add_argument('--some-arg')
parser.add_argument('--another-arg')
MagicNamespace = namedtuple('MagicNamespace', [act.dest for act in parser._actions])
parsed = parser.parse_args(['--some-arg', 'val1', '--another-arg', 'val2'])  # type: MagicNamespace

But this still does not cause attribute errors to get flagged in static analysis. The same is true if I pass namespace=MagicNamespace in the parser.parse_args call.

Alex Waygood
Billy
  • A quick google says that you can use type hints on the first use of local variables. Try it on `parser = argparse.ArgumentParser() # type: argparse.Namespace` and see if it works. – aghast Feb 16 '17 at 16:24
  • @Austin: `parser` in this case is an `argparse.ArgumentParser` object, not an `argparse.Namespace` object. I want the `parsed` object to be populated with the args as attributes. – Billy Feb 16 '17 at 16:26
  • You're right. I missed `parsed` vs. `parser.` What you really want seems to be that PyCharm parses the method arguments when building your ArgumentParser. I doubt that works well. – aghast Feb 16 '17 at 16:33
  • 1
    `add_argument` returns the `Action` object it just created. Look at its attributes. `parser._actions` is a list of all these actions, which the parser uses during parsing. I've mentioned them in previous SO answers. – hpaulj Feb 28 '17 at 09:17
  • In your new edits, are you passing the new namespace to the `parse_args`? – hpaulj Feb 28 '17 at 12:08
  • More on `parser._actions`, http://stackoverflow.com/a/40007285 and http://stackoverflow.com/a/39282787 – hpaulj Feb 28 '17 at 12:40
  • I added a discussion of `namedtuple` to my answer. – hpaulj Feb 28 '17 at 18:06

9 Answers

35

Typed argument parser was made for exactly this purpose. It wraps argparse. Your example is implemented as:

from tap import Tap


class ArgumentParser(Tap):
    somearg: str


parsed = ArgumentParser().parse_args(['--somearg', 'someval'])
the_arg = parsed.somearg

Here's a picture of it in action. [image]

It's on PyPI and can be installed with: pip install typed-argument-parser

Full disclosure: I'm one of the creators of this library.

Jesse
  • This is a solid library, but very basic. It lacks things like callbacks and doesn't support typed subparsers well, which are problems I found after spending an hour on it. While it seems like an exciting project (and they seem to have plans to resolve these issues), as of April 2022, imo it isn't a usable replacement for a moderately complex use of argparse. – Paul Biggar Apr 04 '22 at 20:30
  • I would really recommend using the `namespace` argument to `parse_args` as mentioned by @aghast in this [answer](https://stackoverflow.com/a/42279784/15429). With Python 3.6 and up you can define your own `Namespace` class with attributes that only have type hints. – amos Sep 23 '22 at 08:54
28

Consider defining an extension class to argparse.Namespace that provides the type hints you want:

class MyProgramArgs(argparse.Namespace):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.somearg = 'defaultval'  # type: str

Then use namespace= to pass that to parse_args:

def process_argv():
    parser = argparse.ArgumentParser()
    parser.add_argument('--somearg')
    nsp = MyProgramArgs()
    parsed = parser.parse_args(['--somearg','someval'], namespace=nsp)  # type: MyProgramArgs
    the_arg = parsed.somearg  # <- Pycharm should not complain
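One detail to keep in mind with this approach: argparse only applies a parser default when the namespace does not already have that attribute, so a value set on the namespace wins over default=. The following is a minimal sketch, assuming Python 3.6+ class-level annotations; the strings 'class-default' and 'parser-default' are made up for illustration.

```python
import argparse


class MyProgramArgs(argparse.Namespace):
    # Class-level annotation with a value; the value acts as a default.
    somearg: str = 'class-default'


parser = argparse.ArgumentParser()
parser.add_argument('--somearg', default='parser-default')

# The namespace already has `somearg`, so the parser default is skipped.
with_custom = parser.parse_args([], namespace=MyProgramArgs())
# A plain Namespace has no such attribute, so the parser default applies.
with_plain = parser.parse_args([])
# A value given on the command line overrides both.
explicit = parser.parse_args(['--somearg', 'cli'], namespace=MyProgramArgs())

print(with_custom.somearg, with_plain.somearg, explicit.somearg)
```

If you want the parser's default= to stay in charge, declare the attribute with an annotation only (no value) so that the namespace instance starts out without it.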
aghast
  • 1
    The `defaultval` defined in this class overrides any default parameters defined in the parser methods. That's probably desirable. But it's a detail to watch out for when using custom namespaces. – hpaulj Feb 16 '17 at 23:22
  • 1
    The `__init__` is not needed for Python 3.6 and up, instead add the attributes with a type hint (no value). ```somearg: str``` – amos Sep 23 '22 at 08:36
16

Most of these answers involve using another package to handle the typing. This would be a good idea only if there wasn't such a simple solution as the one I am about to propose.

Step 1. Type Declarations

First, define the types of each argument in a dataclass like so:

from dataclasses import dataclass

@dataclass
class MyProgramArgs:
    first_var: str
    second_var: int

Step 2. Argument Declarations

Then you can set up your parser however you like with matching arguments. For example:

import argparse

parser = argparse.ArgumentParser(description="This CLI program uses type hints!")
parser.add_argument("-a", "--first-var")
parser.add_argument("-b", "--another-var", type=int, dest="second_var")

Step 3. Parsing the Arguments

And finally, we parse the arguments in a way that the static type checker will know about the type of each argument:

my_args = MyProgramArgs(**vars(parser.parse_args()))

Now the type checker knows that my_args is of type MyProgramArgs so it knows exactly which fields are available and what their type is.
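Note that the dataclass does not validate anything at runtime, and the parser and the dataclass can silently drift apart. As a rough sketch (the helper name `parse` and the explicit field check are my own additions, not part of the steps above), you could compare the parsed names against the dataclass fields before constructing it, and mark arguments that may be omitted as Optional:

```python
import argparse
from dataclasses import dataclass, fields
from typing import Optional, Sequence


@dataclass
class MyProgramArgs:
    first_var: Optional[str]   # None when -a is omitted
    second_var: Optional[int]  # None when -b is omitted


def parse(argv: Sequence[str]) -> MyProgramArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", "--first-var")
    parser.add_argument("-b", "--another-var", type=int, dest="second_var")
    raw = vars(parser.parse_args(argv))

    # Fail early, with a readable message, if parser and dataclass differ.
    expected = {f.name for f in fields(MyProgramArgs)}
    if set(raw) != expected:
        raise TypeError(f"parser/dataclass mismatch: {sorted(set(raw) ^ expected)}")
    return MyProgramArgs(**raw)


my_args = parse(["-a", "hello", "-b", "3"])
print(my_args)
```

Constructing the dataclass with mismatched keywords would raise a TypeError anyway; the explicit check only makes the message clearer.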

  • 2
    This is an old thread, but I wanted to say this is a perfect solution. No extra packages are required, and ridiculously easy to implement – Josh Loecker Dec 10 '22 at 17:40
  • Very good solution. Would be more involved if you had multiple subparsers I guess. I think you'd need different data classes with different attributes then. And each of these would have to share a parent class. – The Unfun Cat May 05 '23 at 07:08
  • @TheUnfunCat for multiple subparsers I suppose you could have a different class for each subparser, then based on the first arg you can choose which class to instantiate – Abraham Murciano Benzadon May 07 '23 at 08:41
  • I still like this approach, but mypy cannot validate the types. If first_var is not a string, mypy will not complain. – The Unfun Cat May 10 '23 at 10:21
  • Remark: This solution is working but exactly what OP doesn't like: "I still don't like having to repeat the argument names" – user202729 May 28 '23 at 10:47
5

I don't know anything about how PyCharm handles these typehints, but I understand the Namespace code.

argparse.Namespace is a simple class; essentially an object with a few methods that make it easier to view the attributes. And for ease of unittesting it has a __eq__ method. You can read the definition in the argparse.py file.

The parser interacts with the namespace in the most general way possible - with getattr, setattr, hasattr. So you can use almost any dest string, even ones you can't access with the .dest syntax.

Make sure you don't confuse the add_argument type= parameter with a type hint; type= is a function that converts the input string.

Using your own namespace class (from scratch or subclassed) as suggested in the other answer may be the best option. This is described briefly in the documentation, in the section on the Namespace object. I haven't seen this done much, though I've suggested it a few times to handle special storage needs. So you'll have to experiment.

If using subparsers, using a custom Namespace class may break, http://bugs.python.org/issue27859

Pay attention to handling of defaults. The default default for most argparse actions is None. It is handy to use this after parsing to do something special if the user did not provide this option.

 if args.foo is None:
     # user did not use this optional
     args.foo = 'some post parsing default'
 else:
     # user provided value
     pass

That could get in the way of type hints. Whatever solution you try, pay attention to the defaults.
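One way to reconcile the None default with type hints, sketched here under the assumption that you use a custom Namespace subclass (the class name `Args` is hypothetical), is to annotate the attribute as Optional and narrow it after parsing:

```python
import argparse
from typing import Optional


class Args(argparse.Namespace):
    # None means "the user did not provide this option".
    foo: Optional[str] = None


parser = argparse.ArgumentParser()
parser.add_argument('--foo')

args = parser.parse_args([], namespace=Args())

# After this line a type checker can treat `foo` as a plain str.
foo: str = args.foo if args.foo is not None else 'some post parsing default'
print(foo)
```

This keeps the "was it provided?" information on the namespace while giving the rest of the program a non-optional value.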


A namedtuple won't work as a Namespace.

First, the proper use of a custom Namespace class is:

nm = MyClass(<default values>)
args = parser.parse_args(namespace=nm)

That is, you initialize an instance of that class, and pass it as the parameter. The returned args will be the same instance, with new attributes set by parsing.

Second, a namedtuple can only be created, it can't be changed.

In [72]: MagicSpace=namedtuple('MagicSpace',['foo','bar'])
In [73]: nm = MagicSpace(1,2)
In [74]: nm
Out[74]: MagicSpace(foo=1, bar=2)
In [75]: nm.foo='one'
...
AttributeError: can't set attribute
In [76]: getattr(nm, 'foo')
Out[76]: 1
In [77]: setattr(nm, 'foo', 'one')    # not even with setattr
...
AttributeError: can't set attribute

A namespace has to work with getattr and setattr.
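As a self-contained illustration of the point (a sketch, not something the parser documents as supported or unsupported): passing a namedtuple instance as the namespace fails as soon as the parser tries to store a value, whereas converting the ordinary Namespace into a namedtuple after parsing works, provided the field names match the parser's dest values.

```python
import argparse
from collections import namedtuple

MagicSpace = namedtuple('MagicSpace', ['foo', 'bar'])

parser = argparse.ArgumentParser()
parser.add_argument('--foo')
parser.add_argument('--bar')
argv = ['--foo', 'one', '--bar', 'two']

# Using the namedtuple as the namespace: setattr on it is not allowed.
try:
    parser.parse_args(argv, namespace=MagicSpace(1, 2))
    failed = False
except AttributeError:
    failed = True

# Parsing into a regular Namespace first, then freezing the result.
frozen = MagicSpace(**vars(parser.parse_args(argv)))
print(failed, frozen)
```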

Another problem with namedtuple is that it doesn't set any kind of type information. It just defines field/attribute names. So there's nothing for the static typing to check.

While it is easy to get expected attribute names from the parser, you can't get any expected types.

For a simple parser:

In [82]: parser.print_usage()
usage: ipython3 [-h] [-foo FOO] bar
In [83]: [a.dest for a in parser._actions[1:]]
Out[83]: ['foo', 'bar']
In [84]: [a.type for a in parser._actions[1:]]
Out[84]: [None, None]

The Action's dest is the normal attribute name. But type is not the expected static type of that attribute. It is a function that may or may not convert the input string. Here None means the input string is saved as is.
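A runnable version of that inspection might look like the following. Keep in mind that `_actions` is a private attribute of the parser, so this is a sketch that relies on an implementation detail; the extra `--count` argument is added only to show a non-None `type`.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-foo')
parser.add_argument('bar')
parser.add_argument('--count', type=int)

# The first action is the automatically added -h/--help, so skip it.
actions = parser._actions[1:]

dests = [a.dest for a in actions]       # attribute names on the namespace
converters = [a.type for a in actions]  # conversion functions, not static types

print(dests)
print(converters)
```

Here `int` happens to be both the converter and the resulting type, but that coincidence does not hold for arbitrary converter functions.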

Because static typing and argparse require different information, there isn't an easy way to generate one from the other.

I think the best you can do is create your own database of parameters, probably in a dictionary, and create both the Namespace class and the parser from that, with your own utility function(s).

Let's say dd is a dictionary with the necessary keys. Then we can create an argument with:

parser.add_argument(dd['short'],dd['long'], dest=dd['dest'], type=dd['typefun'], default=dd['default'], help=dd['help'])

You or someone else will have to come up with a Namespace class definition that sets the default (easy), and static type (hard?) from such a dictionary.
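As a very rough sketch of the "easy" half of that idea, the following builds both the parser and a Namespace subclass carrying the defaults from one list of dictionaries. The names `SPECS`, `build_parser` and `build_namespace_class` are hypothetical, and because the class is created at runtime, a static checker still cannot see its attributes; that remains the hard part.

```python
import argparse

# One dictionary per argument, using the keys from the add_argument call above.
SPECS = [
    {"short": "-f", "long": "--foo", "dest": "foo",
     "typefun": str, "default": "x", "help": "a string"},
    {"short": "-n", "long": "--num", "dest": "num",
     "typefun": int, "default": 0, "help": "an integer"},
]


def build_parser(specs):
    parser = argparse.ArgumentParser()
    for dd in specs:
        parser.add_argument(dd["short"], dd["long"], dest=dd["dest"],
                            type=dd["typefun"], default=dd["default"],
                            help=dd["help"])
    return parser


def build_namespace_class(specs):
    # Defaults become class attributes of a dynamically created subclass.
    defaults = {dd["dest"]: dd["default"] for dd in specs}
    return type("SpecNamespace", (argparse.Namespace,), defaults)


parser = build_parser(SPECS)
SpecNamespace = build_namespace_class(SPECS)
args = parser.parse_args(["-n", "5"], namespace=SpecNamespace())
print(args.foo, args.num)
```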

hpaulj
3

If you are in a situation where you can start from scratch there are interesting solutions like

However, in my case they weren't an ideal solution because:

  1. I have many existing CLIs based on argparse, and I cannot afford to re-write them all using such args-inferred-from-types approaches.
  2. When inferring args from types it can be tricky to support all advanced CLI features that plain argparse supports.
  3. Re-using common arg definitions in multiple CLIs is often easier in plain imperative argparse compared to alternatives.

Therefore I worked on a tiny library, typed_argparse, that lets you introduce typed args without much refactoring. The idea is to add a type derived from a special TypedArgs class, which then simply wraps the plain argparse.Namespace object:

import argparse
import sys
from typing import List, Optional

from typed_argparse import TypedArgs

# Step 1: Add an argument type.
class MyArgs(TypedArgs):
    foo: str
    num: Optional[int]
    files: List[str]


def parse_args(args: List[str] = sys.argv[1:]) -> MyArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument("--foo", type=str, required=True)
    parser.add_argument("--num", type=int)
    parser.add_argument("--files", type=str, nargs="*")
    # Step 2: Wrap the plain argparser result with your type.
    return MyArgs(parser.parse_args(args))


def main() -> None:
    args = parse_args(["--foo", "foo", "--num", "42", "--files", "a", "b", "c"])
    # Step 3: Done, enjoy IDE auto-completion and strong type safety
    assert args.foo == "foo"
    assert args.num == 42
    assert args.files == ["a", "b", "c"]

This approach slightly violates the single-source-of-truth principle, but the library performs a full runtime validation to ensure that the type annotations match the argparse types, and it is just a very simple option to migrate towards typed CLIs.

bluenote10
2

Another way to do it which could be ideal if you have few arguments is as follows.

First make a function that sets up the parser and returns the namespace. For example:

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("-a")
    parser.add_argument("-b", type=int)
    return parser.parse_args()

Then you define a main function which takes the args you declared above individually, like so:

def main(a: str, b: int):
    print("hello world", a, b)

And when you call your main, you do it like this:

if __name__ == "__main__":
    main(**vars(parse_args()))

From your main onwards, you'll have your variables a and b properly recognised by your static type checker, although you won't have an object any more containing all your arguments, which may be a good or bad thing depending on your use case.

1

A simple solution: just type hint the NameSpace return value of the parse_args method.

import argparse
from typing import Type


class NameSpace(argparse.Namespace, Type):
    name: str


class CustomParser(argparse.ArgumentParser):
    def parse_args(self) -> NameSpace:
        return super().parse_args()


parser = CustomParser()

parser.add_argument("--name")

if __name__ == "__main__":
    args = parser.parse_args()
    print(args.name)

Jason Leaver
  • 2
    This can be shortened by passing ```namespace=NameSpace()``` to the `parse_args()` method. No need for the custom parser. Also the `NameSpace` class doesn't need to inherit `Type`, at least from Python 3.6 and up. – amos Sep 23 '22 at 08:51
  • This does not pass mypy :/ "Invalid base class "Type"", "Signature of "parse_args" incompatible with supertype "ArgumentParser"" – The Unfun Cat May 10 '23 at 10:24
1

Here's Jason Leaver's solution with amos's suggested improvement:

import argparse

class Namespace(argparse.Namespace):
    name: str

parser = argparse.ArgumentParser()
parser.add_argument("--name")

args = parser.parse_args(namespace=Namespace())
akaihola
0

I'd like to contribute a slightly more involved solution that focuses on the OP's concern:

While this is satisfactory, I still don't like having to repeat the argument names. If the argument list grows considerably, it will be tedious updating both locations. What would be ideal is somehow extracting the arguments from the parser object

This solution doesn't use external packages. It uses dataclasses to define the arguments, just like Abraham Murciano's answer, but the arguments are extracted with the fields function. All the definitions are made inside the Args dataclass.

I also added some typing tools to make the solution completely type checked.

I'm using Pylance inside VS Code, which makes use of Pyright type checker.

1. Some definitions

The function make_arg will be used to create the "args fields", and add_args will add the arguments to the parser:

from __future__ import annotations
import argparse
from dataclasses import dataclass, field, fields
from typing import Sequence, Callable, Any

def make_arg(
    name: str | Sequence[str] | None = None,
    action: str = "store",
    nargs: int | str | None = None,
    const: Any = None,
    default: Any = None,
    type: Callable[[str], Any] = str,
    help: str = "",
    metavar: str | Sequence[str] | None = None,
):
    arg_dict = locals()
    if name is not None and not name[0].startswith("-"):
        raise ValueError("`name` should be passed only to flagged args")
    if isinstance(name, str):
        arg_dict["name"] = [name]
    if arg_dict["action"] in ["store_true", "store_const"]:
        arg_dict.pop("nargs")
        arg_dict.pop("const")
        arg_dict.pop("type")
        arg_dict.pop("metavar")
    arg_dict.pop("default")
    return field(default=default, metadata=arg_dict)

def add_args(args_cls: type[Args], parser: argparse.ArgumentParser):
    for arg in fields(args_cls):
        arg_dict = dict(arg.metadata)
        if arg_dict["name"] is None:  # no flagged arg
            arg_dict["name"] = [arg.name]
        else:  # flagged arg
            arg_dict["dest"] = arg.name
        arg_name = arg_dict.pop("name")
        parser.add_argument(*arg_name, **arg_dict)

2. Usage

All the argument definitions are made just once, inside the Args dataclass, with the function make_arg.

@dataclass
class Args:
    some_arg: str = make_arg()
    flagged_arg: str = make_arg(name="--another-arg")

# Create the parser object
parser = argparse.ArgumentParser()

# Add the args using the dataclass and the parser
add_args(Args, parser)

parsed = parser.parse_args(
    ["val1", "--another-arg", "val2"],
    namespace=Args(),
)

x = parsed.some_arg  # good...
y = parsed.flagged_arg  # still good...
z = parsed.aint_no_arg  # Flagged by PyLance!
Diogo