I have a large test suite that contains thousands of tests of an information extraction engine. There are about five hundred inputs, and I have to extract 10-90 items of information from each input.
For maximal transparency, I test each item from each input separately, so that I can look at the pytest log and tell exactly which tests flip after a code change. But since it's inefficient to run extraction within each test dozens of times, I organized my tests so that the extraction is done only once for each input, a test-global variable is set and then each test simply checks one aspect of the result.
It seems that this set-up prevents any speedup gains when using python-xdist. Apparently, pytest will still initialize all 500 test files in one thread and then have individual workers execute the test methods - but the actual tests take no time, so I save nothing!
Is there a simple way to instruct pytest-xdist to distribute the test files rather than the test methods to the workers, so that I can parallelize the 500 calls to the engine?