
I am really new to Python packaging. It is already a confusing topic, with recommended approaches and options that only a minority seems to follow. To make it worse, I stumbled over the following problem.

I started with the intention of writing a rather small package with a really focused purpose. My first solution imported pandas, but I got a request to remove that dependency. So I refactored the function and, unsurprisingly, it's slower. It is slower to an extent that I can hardly accept.

So one solution would be to provide a package that uses pandas and a package that doesn't, so that people can use either one depending on their project requirements. Now I am wondering what the best way to provide that is.

I could:

  1. Create two separate projects with different package names. That would work, but I want to keep the code together, and some functions and code are shared.
  2. Do 1., but import the shared parts from the simple package.
  3. Use subpackages, if that would remove the dependency from the core subpackage.

What is a good way to fulfill the different needs?

FordPrefect

1 Answer


I think this is a pretty good use case for optional dependencies. You could define an optional dependency named your_package[fast] that installs pandas. And in your code you could try something like:

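# Record at import time whether pandas is available.
try:
    import pandas as pd
    PANDAS_INSTALLED = True
except ImportError:
    PANDAS_INSTALLED = False

# some other code...

# Define the public function once, choosing the implementation by the flag.
if PANDAS_INSTALLED:
    def your_function(*args, **kwargs):  # pandas installed
        ...
else:
    def your_function(*args, **kwargs):  # pandas not installed
        ...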
Clasherkasten
  • Thanks, that was an unexpected solution. Is this construct idiomatic Python, or is it a bit exotic? – FordPrefect Dec 20 '22 at 12:24
  • Yes, optional dependencies (often called "extras") are the way to go. They are perfectly normal, nothing exotic about them. – sinoroc Dec 20 '22 at 15:14
  • @sinoroc: I see that optional dependencies are common. What appears exotic to me is defining the function based on the attempted import. I see that it works, but I wonder about the implications for documentation and testing. I set up hatch to create environments with pandas and without pandas. That works, but test coverage is now always bad, because I test different things in each environment. So I am just wondering whether this is meant to be used this way or is more of a hack. – FordPrefect Dec 20 '22 at 16:08
  • `try: import pandas; except ImportError: ...` is the way to go. -- For code coverage I do not know. It's probably easy to find advice (on StackOverflow or elsewhere), if not then feel free to ask another question. – sinoroc Dec 20 '22 at 18:10
  • So it's good practice on StackOverflow to ask follow-up questions separately? – FordPrefect Dec 20 '22 at 20:29