
Let's say I have a widely distributed/used Python package called foo that's designed to work with the following dependencies:

  • pandas>=1.3.0    
  • pyarrow>=8.0
  • python>=3.8

How do I make sure that my foo package is actually compatible with all supported versions of those dependencies, so that people have a seamless experience using my package?

One idea that I had is to run my test suite against a whole bunch of environments with different versions of the dependent packages. For example, run the test suite 13 times in environments with the following dependency versions (a sketch of how this could be automated follows the list):

  1. pandas=1.3.0, pyarrow=11.0, python=3.11.2
  2. pandas=1.4.0, pyarrow=11.0, python=3.11.2
  3. pandas=1.5.0, pyarrow=11.0, python=3.11.2
  4. pandas=2.0.0, pyarrow=11.0, python=3.11.2
  5. pyarrow=8.0, pandas=2.0.0, python=3.11.2
  6. pyarrow=9.0, pandas=2.0.0, python=3.11.2
  7. pyarrow=10.0, pandas=2.0.0, python=3.11.2
  8. pyarrow=11.0, pandas=2.0.0, python=3.11.2
  9. python=3.8, pandas=2.0.0, pyarrow=11.0
  10. python=3.9, pandas=2.0.0, pyarrow=11.0
  11. python=3.10, pandas=2.0.0, pyarrow=11.0
  12. python=3.11, pandas=2.0.0, pyarrow=11.0
  13. python=3.11.2, pandas=2.0.0, pyarrow=11.0
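
Concretely, here is a minimal sketch of how I could automate something like this with nox, assuming a pytest-based suite. The version lists and session name are illustrative, and the full cross-product it generates is much larger than my 13 runs, so in practice I'd trim the lists or hand-pick combinations:

```python
# noxfile.py -- minimal sketch of a dependency/interpreter test matrix.
# The version lists are illustrative; nox generates the full cross-product,
# so trim them (or select sessions by name) to keep the run count sane.
import nox

PANDAS_VERSIONS = ["1.3.0", "1.4.0", "1.5.0", "2.0.0"]
PYARROW_VERSIONS = ["8.0", "9.0", "10.0", "11.0"]

@nox.session(python=["3.8", "3.9", "3.10", "3.11"])
@nox.parametrize("pandas", PANDAS_VERSIONS)
@nox.parametrize("pyarrow", PYARROW_VERSIONS)
def tests(session, pandas, pyarrow):
    # Pin the dependency versions under test, install foo itself, run the suite.
    session.install(f"pandas=={pandas}", f"pyarrow=={pyarrow}")
    session.install(".")
    session.run("pytest")
```

Running `nox --list` shows every generated session, and `nox -s <session name>` runs a single combination.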

Is there a more robust way to do it? For example, what if my foo package doesn't work with pandas version 1.5.3? I don't think testing every major and minor release of all the dependent packages is feasible.

tom1919

2 Answers


In general we may have significantly more than three deps, leading to a combinatorial explosion of version combinations. And mutual compatibility among the deps themselves may be fragile, burdening you with tracking things like "A cannot import B between breaking change X and bugfix Y". Import renames sometimes stir up trouble of that sort.

Testing e.g. pandas 1.5.0 may be of limited interest once we know the bugfixes of 1.5.3 are out.

Is there a more robust way to do it?

I recommend you adopt a "point in time" approach, so test configs resemble real user configs.

First pick a budget of K tests and an "earliest" date. We will test between that date and the current date, so initially we have 2 tests for those two dates, with K - 2 remaining in the budget.

For a given historic date, scan the deps for their release dates and request installation of the newest version of each dep released on or before that date. Allow some flexibility so that you get e.g. pandas 1.4.4 installed ("< 1.5") rather than the less interesting 1.4.0. Run the test and watch it succeed. Report the test's corresponding date, which is the max of the release dates of the installed dependencies.
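
Here is a rough sketch of that per-date lookup, assuming the public PyPI JSON API (https://pypi.org/pypi/<name>/json) and the packaging library; the helper name pin_for_date is made up for illustration:

```python
# Sketch: for a historic cutoff date, find the newest release of each dep that
# was already on PyPI by that date. Uses PyPI's JSON API and the `packaging`
# library; `pin_for_date` is a made-up name for illustration.
import json
import urllib.request
from datetime import datetime, timezone

from packaging.version import InvalidVersion, Version


def pin_for_date(package, cutoff):
    """Return an exact pin for the newest `package` release on or before `cutoff`."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as resp:
        releases = json.load(resp)["releases"]
    best = None
    for ver, files in releases.items():
        try:
            v = Version(ver)
        except InvalidVersion:
            continue
        if v.is_prerelease or not files:
            continue
        # A release's date is the earliest upload time among its files.
        uploaded = min(
            datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
            for f in files
        )
        if uploaded <= cutoff and (best is None or v > best):
            best = v
    return f"{package}=={best}"


cutoff = datetime(2022, 6, 1, tzinfo=timezone.utc)
print([pin_for_date(dep, cutoff) for dep in ["pandas", "pyarrow"]])
```

In practice you would relax the resulting "==" pins to "< next-minor" caps, as described above, and hand the strings to pip or your test runner.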

At this point there are two ways you could go. You might pick a single dep and constrain it (">= 1.5" or ">= 2.0") to simulate a user who wanted a certain released feature and updated that specific package. A likely better way to spend the test budget is to bisect a range of your "reported" dates, locate when a dep bumped its minor version number, and adjust the constraints to pull that in. It may affect a single dep, but the install solver will likely uprev additional deps as well, and that's fine. Report the test result, lather, rinse, consume the budget. Proudly publish the testing details on your website.


Given that everything takes a dependency on the CPython interpreter, one way to do "point in time" is to simply pick K interpreter releases and constrain the install so it demands an exact match on the release number, e.g. 3.10.8. Then ratchet down the various minor version numbers as far as you can get away with, e.g. pandas "< 1.5" or "< 1.4".
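
As a sketch of that variant, again assuming nox: each session pins one interpreter and a set of upper bounds to ratchet down. The caps below are illustrative only, and pinning the interpreter to an exact patch release like 3.10.8 depends on which interpreters your CI image provides.

```python
# noxfile.py -- sketch of the "exact interpreter, ratcheted caps" variant.
# The caps per interpreter are illustrative; push them back until foo breaks.
import nox

POINT_IN_TIME = {
    "3.8": ["pandas<1.4", "pyarrow<9"],
    "3.10": ["pandas<1.5", "pyarrow<11"],
    "3.11": ["pandas<2.1", "pyarrow<12"],
}

@nox.session(python=list(POINT_IN_TIME))
def point_in_time(session):
    # session.python is the interpreter this session was generated for.
    session.install(*POINT_IN_TIME[session.python], ".")
    session.run("pytest")
```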

J_H

I just came across this article below in case it helps anyone:

https://sentry.engineering/blog/how-we-run-our-python-tests-in-hundreds-of-environments-really-fast

They test their package against hundreds of environments, and looking at their repo it seems like they test against major versions of some packages and just the latest versions of others. It's kind of impressive that they still support Python 2.7 and a dependent package version from 8 years ago. I asked them how they pick which versions of packages to test against, and they said they basically test against the versions that are current when they add an integration for a framework and then almost never remove a version once it's there.

Another idea I had is to ship the foo package with vendored copies of the dependent packages. For example, basically copy and paste pandas as my_pandas and include it in the foo package (along with my_numpy and other dependencies, walking down the dependency tree). Internally the foo package would import my_pandas instead of using the pandas version in the user's environment. I think this approach is somewhat similar to what the Spyder IDE does.

tom1919