
When using an optional import, i.e. when a package is imported only inside a function because I want it to be an optional dependency of my package, is there a way to annotate the function's return type with one of the classes belonging to that optional dependency?

To give a simple example with pandas as an optional dependency:

def my_func() -> pd.DataFrame:                                                  
    import pandas as pd                                                         
    return pd.DataFrame()                                                       

df = my_func()

In this case, since the import statement is within my_func, this code will, not surprisingly, raise:

NameError: name 'pd' is not defined

If the string literal type hint were used instead, i.e.:

def my_func() -> 'pd.DataFrame':                                                
    import pandas as pd                                                         
    return pd.DataFrame()                                                       

df = my_func()

the module can now be executed without issue, but mypy will complain:

error: Name 'pd' is not defined

How can I make the module execute successfully and retain the static type checking capability, while also having this import be optional?

dspencer

3 Answers


Try putting your import inside an if typing.TYPE_CHECKING block at the top of your file. This constant is always False at runtime, but type checkers treat it as always True.

For example:

# Lets us avoid needing to use forward references everywhere
# for Python 3.7+
from __future__ import annotations
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd

def my_func() -> pd.DataFrame:  
    import pandas as pd                                                 
    return pd.DataFrame()

You can also do if False:, but I think that makes it a little harder for somebody to tell what's going on.

One caveat is that this does mean that while pandas will be an optional dependency at runtime, it'll still be a mandatory one for the purposes of type checking.
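To see why this is safe at runtime, here is a minimal sketch. It uses a quoted annotation instead of the `__future__` import (an assumption made only so the snippet can be pasted anywhere in a file), and it inspects `__annotations__` to show that the hint is stored as a plain string and never evaluated when the function is defined:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only a static type checker follows this import; it is skipped at runtime.
    import pandas as pd


def my_func() -> "pd.DataFrame":
    # Deferred import: pandas is only needed once the function is called.
    import pandas as pd
    return pd.DataFrame()


# Defining my_func did not touch pandas: the annotation is just a string.
print(TYPE_CHECKING)                       # False
print(my_func.__annotations__["return"])   # pd.DataFrame
```

Calling `my_func()` still requires pandas to be installed; only the definition and the type hint are decoupled from it.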

Another option you can explore using is mypy's --always-true and --always-false flags. This would give you finer-grained control over which parts of your code are typechecked. For example, you could do something like this:

try:
    import pandas as pd
    PANDAS_EXISTS = True
except ImportError:
    PANDAS_EXISTS = False

if PANDAS_EXISTS:
    def my_func() -> pd.DataFrame:                                                   
        return pd.DataFrame()

...then do mypy --always-true=PANDAS_EXISTS your_code.py to type check it assuming pandas is imported and mypy --always-false=PANDAS_EXISTS your_code.py to type check assuming it's missing.

This could help you catch cases where you accidentally use a function that requires pandas from a function that isn't supposed to need it -- though the caveats are that (a) this is a mypy-only solution and (b) having functions that only sometimes exist in your library might be confusing for the end-user.

Michael0x2a
  • In the first solution you propose, the boolean constant should be `TYPE_CHECKING`, I believe. In this case, the code will fail at runtime as `pandas` wasn't imported. Does it still require the import inside `my_func`? – dspencer Apr 24 '20 at 02:38
  • @dspencer -- Good catches. I replaced `TYPE_HINTING` with `TYPE_CHECKING` and added back the missing import to the first example. – Michael0x2a Apr 24 '20 at 07:23
  • A tempting modification of this would be `PANDAS_EXISTS = bool(importlib.util.find_spec('pandas'))` or, worse, `PANDAS_EXISTS = not not importlib.util.find_spec('pandas')`, where `find_spec` returns either an `importlib.machinery.ModuleSpec` or None. Sure, catching the error looks less elegant, but it is totally within guidelines, as EAFP is a key tenet of PEP 20 (the Zen of Python). Using `importlib.util` is also less legible (PEP 8 favors legibility) and, most importantly, if pandas were installed but could not be imported because of a missing dependency, the import would raise an ImportError while `find_spec` would be oblivious to it. – Matteo Ferla Jan 19 '22 at 10:59

Here's the solution I've tentatively been using, which seems to work in PyCharm's type checker, though I haven't tried mypy.

from typing import TypeVar, TYPE_CHECKING

PANDAS_CONFIRMED = False
if TYPE_CHECKING:
    try:
        import pandas as pd
        PANDAS_CONFIRMED = True
    except ImportError:
        pass 

if PANDAS_CONFIRMED:
    DataFrameType = pd.DataFrame
else:
    DataFrameType = TypeVar('DataFrameType')

def my_func() -> DataFrameType:  
    import pandas as pd                                                 
    return pd.DataFrame()

This has the advantage of always defining the function (so if someone runs code that calls my_func, they'll get an informative ImportError rather than a misleading AttributeError). This also always offers some sort of type-hint even when pandas is not installed, without trying to import pandas prematurely at runtime. The if-else structure makes PyCharm view some instances of DataFrameType as being Union[DataFrame, DataFrameType] but it still provides linting information that is well-suited for a DataFrame, and in some cases, like my_func's output, it somehow infers that a DataFrameType instance will always be a DataFrame.
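The "informative ImportError" point can be made explicit with a small helper. This is only a sketch: `_require_pandas` and the wording of the message are hypothetical names chosen for illustration, not part of pandas or of the answer above.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd


def _require_pandas():
    """Import pandas on demand, or raise an ImportError that names the fix."""
    try:
        import pandas
    except ImportError as exc:
        # Hypothetical message; adapt it to your package's install instructions.
        raise ImportError(
            "my_func() needs the optional dependency 'pandas'; "
            "install it with `pip install pandas`."
        ) from exc
    return pandas


def my_func() -> "pd.DataFrame":
    # The function always exists; it only fails, with a clear message,
    # when it is called without pandas installed.
    pandas = _require_pandas()
    return pandas.DataFrame()
```

With this shape, `my_func` is always importable from the package, and a caller without pandas sees a message that says which dependency is missing rather than an AttributeError about a missing function.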

JustinFisher

Another approach that can avoid problems with some linters (e.g. Pylance) is this one:

from typing import Any, TYPE_CHECKING, TypeVar

DataFrame = Any

if TYPE_CHECKING:
    try:
        from pandas import DataFrame

    except ImportError:
        pass

DataFrameType = TypeVar("DataFrameType", bound=DataFrame)

def my_func() -> DataFrameType:  
    import pandas as pd                                                 
    return pd.DataFrame()

Pablo R. Mier