I have a fairly large dataset that I would like to use for testing a FastAPI endpoint. The problem is that the tests take a very long time to run. I am not sure whether I have parametrized the tests correctly or whether I have set up async correctly.

# Note: function names, paths and other information are obfuscated
import pandas as pd
import pytest_asyncio
import pytest
from httpx import AsyncClient

from MODULE import app #FastAPI

dataset = pd.read_csv("data.csv")

@pytest_asyncio.fixture()
async def async_app_client():
    async with AsyncClient(app=app, base_url='http://localhost') as client:
        yield client

@pytest.mark.asyncio
@pytest.mark.parametrize("term", dataset['value'])
async def test_on_term(async_app_client, term):
    # Pass the term via params so httpx URL-encodes spaces and special characters
    response = await async_app_client.get("/endpoint", params={"text": term})
    assert response.status_code == 200, f"{term} returned non-200 status"

Does this setup look correct? Tests run using this setup but the execution time for all the tests is abnormally long. Any other ideas to speed this up would be appreciated!
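For context on the async part: pytest runs parametrized cases one after another, so marking each test as async does not by itself make the requests overlap. A minimal sketch of one alternative is to issue the requests concurrently inside a single test with `asyncio.gather`, capped by a semaphore. The names `fake_get` and `check_terms` are hypothetical, and `fake_get` only stands in for `await async_app_client.get(...)`; whether this helps in practice depends on where the endpoint actually spends its time.

```python
import asyncio


async def fake_get(term: str) -> int:
    """Hypothetical stand-in for `await client.get(...)`; returns a status code."""
    await asyncio.sleep(0.01)  # simulate waiting on the endpoint
    return 200


async def check_terms(terms, max_concurrency=10):
    """Request every term concurrently, at most `max_concurrency` at a time.

    Returns the list of terms whose status code was not 200.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def check_one(term):
        async with semaphore:
            return term, await fake_get(term)

    # gather preserves input order, so failures come back in dataset order
    results = await asyncio.gather(*(check_one(t) for t in terms))
    return [term for term, status in results if status != 200]


failures = asyncio.run(check_terms(["red firetruck", "20 foot long hotdog"]))
```

The trade-off is that a single batched test reports all failures together rather than as one pytest result per term.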

I've tried pytest-xdist, but the data sometimes contains repeated values, and I get inconsistent results for a given test.
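One possible way to deal with the repeated values is to deduplicate the terms before parametrizing, keeping first-seen order so every pytest-xdist worker collects the same cases in the same order. The sketch below uses the standard library with an inline sample; `SAMPLE_CSV` and `load_unique_terms` are hypothetical names, and the sample rows only stand in for the real `data.csv`, whose contents are not shown.

```python
import csv
import io

# Hypothetical stand-in for data.csv; the real file is not part of the question.
SAMPLE_CSV = "value\nred firetruck\n20 foot long hotdog\nred firetruck\n"


def load_unique_terms(handle):
    """Return the `value` column with duplicates dropped, first-seen order kept."""
    reader = csv.DictReader(handle)
    # dict.fromkeys removes duplicates while preserving insertion order
    return list(dict.fromkeys(row["value"] for row in reader))


unique_terms = load_unique_terms(io.StringIO(SAMPLE_CSV))
```

The resulting list could then be passed to `@pytest.mark.parametrize("term", unique_terms)`. With pandas, `dataset['value'].drop_duplicates()` should give the same result.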

kwehmeyer
  • The variable 'dataset' is not used, do you need to read the csv file? – Tom McLean Feb 01 '23 at 17:44
  • Sorry, a typo in my write up. @TomMcLean I've fixed it above. – kwehmeyer Feb 01 '23 at 17:45
  • Can you post the code of what "/endpoint?text=" is and also a sample of dataset['value']? – Tom McLean Feb 01 '23 at 17:53
  • I can't publish the code for the endpoint, but in short it reads in the data, processes it and does a read/write from a db. The endpoint is not declared as `async` within FastAPI. A `value` can be any string, e.g. "red firetruck" or "20 foot long hotdog". – kwehmeyer Feb 01 '23 at 17:57
  • Have you measured how long each iteration of your test takes? For any large enough dataset, it will take time unless you parallelize it across many computers. – MatsLindh Feb 01 '23 at 22:35
  • Great point @MatsLindh. I did eventually end up doing this approach. I just had to ensure test cases were entirely unique in data – kwehmeyer Feb 02 '23 at 17:18
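To follow up on measuring how long each iteration takes, a rough sketch of a timing helper is shown below. `timed` and `slow_lookup` are hypothetical names, and `slow_lookup` only simulates a slow endpoint call; it is not the real endpoint.

```python
import time


def timed(func, *args, **kwargs):
    """Call `func` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def slow_lookup(term):
    """Hypothetical stand-in for one request to the endpoint."""
    time.sleep(0.01)
    return 200


status, seconds = timed(slow_lookup, "red firetruck")
```

pytest can also report the slowest tests itself via its `--durations=N` option, which may be enough to see whether the time goes into each request or into setup.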
