I have a MongoDB database with a very large collection that I need to run tests on with Pytest. I am trying the usual route of the mark.parametrize decorator, but with a pymongo.cursor Cursor object:

import pytest

def get_all_data():
    # db is an existing pymongo Database handle
    return db["collection"].find({})  # query to retrieve all documents from the collection

@pytest.mark.parametrize("doc", get_all_data())
def test_1(doc):
    assert doc["val"] == 1
    ....

The problem with this code is that pytest, during the collection stage and before running any tests, automatically converts the generator into a list. I don't want this for two reasons:

  1. Speed: this is very slow because the collection is very large.
  2. Memory: there is not enough RAM to load all of this data anyway.
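
To make the difference concrete, here is a minimal sketch that uses neither pytest nor MongoDB. `fake_cursor` and `produced` are hypothetical names; the generator only stands in for the real cursor and records how many documents it has handed out so far.

```python
produced = []  # ids of every document the stand-in cursor has handed out


def fake_cursor(n):
    """Hypothetical stand-in for a pymongo cursor: yields n small documents lazily."""
    for i in range(n):
        produced.append(i)
        yield {"_id": i, "val": 1}


# Eager: building a list pulls every document before any check can run.
docs = list(fake_cursor(5))
eager_count = len(produced)

# Lazy: asking for one document pulls exactly one from the source.
produced.clear()
first = next(fake_cursor(5))
lazy_count = len(produced)
```

With a list, all five documents exist in memory before the first assertion; with lazy iteration only the current one does.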

This means I cannot use mark.parametrize. How can I still use a generator to run the tests one document at a time, without loading everything into memory up front? Is this even possible with Pytest?

tHeReaver

1 Answer

I can think of this workaround: write a fixture that passes the generator to a single test, then check each entry individually inside that test using pytest-check (I assume you need to assert each entry separately and continue even if some entries fail).

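import pytest
from pytest_check import check  # older pytest-check releases: import pytest_check as check

@pytest.fixture
def get_all_data():
    # The cursor is lazy: documents are fetched in batches as the test iterates.
    yield db["collection"].find({})

def test_1(get_all_data):
    for each in get_all_data:
        # Records a failure and keeps going; equal() matches the == in the question.
        check.equal(each["val"], 1)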
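
If adding the plugin is not an option, the same idea can be sketched in plain Python: iterate lazily and keep only the ids of the failing documents. This is a rough illustration under assumptions, not what pytest-check does internally. `fake_cursor`, `find_bad_ids` and the `limit` parameter are hypothetical names, and the generator stands in for `db["collection"].find({})`.

```python
def fake_cursor():
    """Hypothetical stand-in for db["collection"].find({}); yields documents lazily."""
    for i in range(1000):
        # Every 250th document is deliberately wrong so there is something to report.
        yield {"_id": i, "val": 0 if i % 250 == 249 else 1}


def find_bad_ids(docs, limit=20):
    """Iterate lazily and keep only the ids of failing documents, up to `limit`."""
    bad = []
    for doc in docs:
        if doc["val"] != 1:
            bad.append(doc["_id"])
            if len(bad) >= limit:
                break  # stop early instead of scanning the rest of the collection
    return bad


bad_ids = find_bad_ids(fake_cursor())
```

Inside a test this would become something like `assert not find_bad_ids(get_all_data)`. Memory use stays bounded by `limit` rather than by the size of the collection.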
Shod
  • Maybe you could add `check` fixture into the test function signature (`def test_1(get_all_data, check)`) so that it's clear where it comes from. – tmt Nov 29 '21 at 10:49
  • Yeah, that's true for fixtures in general. Pytest does a lot of magic. For `check` though, it's an import from `pytest_check`. – Shod Nov 29 '21 at 11:15