2

I have a large test suite that contains thousands of tests of an information extraction engine. There are about five hundred inputs, and I have to extract 10-90 items of information from each input.

For maximal transparency, I test each item from each input separately, so that I can look at the pytest log and tell exactly which tests flip after a code change. But since it's inefficient to run extraction within each test dozens of times, I organized my tests so that the extraction is done only once for each input, a test-global variable is set and then each test simply checks one aspect of the result.

It seems that this set-up prevents any speedup gains when using python-xdist. Apparently, pytest will still initialize all 500 test files in one thread and then have individual workers execute the test methods - but the actual tests take no time, so I save nothing!

Is there a simple way to instruct pytest-xdist to distribute the test files rather than the test methods to the workers, so that I can parallelize the 500 calls to the engine?

Kilian Foth
  • 13,904
  • 5
  • 39
  • 57
  • Have you tried `--dist loadfile`? It will distribute the workers to files, but I'm not sure if the timing will be better in this case. – MrBean Bremen Jul 02 '22 at 12:40
  • Hmmm. According to the docs, it looks as if no matter which distribution strategy is selected, the controller first lets *each* worker collect all tests, verifies that they all have the same result, and *then* distributes the text executions to workers. That's no good for me, since the collection is the expensive part in my test suite. – Kilian Foth Jul 02 '22 at 16:26
  • I see - your best bet is probably to directly ask at the [pytest-xdist](https://github.com/pytest-dev/pytest-xdist/) site, e.g. create an issue or disussion item. – MrBean Bremen Jul 02 '22 at 17:15

1 Answers1

1

According to official docs - https://pytest-xdist.readthedocs.io/en/latest/distribution.html, you can use one of the following:

  • custom groups (mark all related tests with @pytest.mark.xdist_group), and run cli with flag --dist loadgroup. It can help you run entire groups (in your cases - all methods of class) in isolated worker
  • if you use OOP style (Test classes with test methods or modules with tests) - use pytest CLI flag --dist loadscope , this approach will make groups automatically (modules, or classes) and guarantees that all tests in a group run in the same process.