As part of benchmarking AWS Athena against Redshift Serverless, I'm writing a Locust-based load test script so that I can compare the results afterwards. I'm using Python 3.10.4.
When I started working on the Athena client, I noticed that the boto3 client is asynchronous: each query takes two separate API calls, one to start the query and get its execution ID, and a second to fetch the results.
My concern is that this asynchronous implementation will make my load test results inaccurate.
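To make the concern concrete, here is a rough sketch of the blocking wrapper I have in mind, so that one timed "request" covers the whole submit-to-results round trip. It assumes a boto3 Athena client is passed in and that the query state is polled with `get_query_execution` between the two calls. The function name `run_query_blocking` and the `poll_interval` and `timeout` parameters are my own placeholders, not part of boto3.

```python
import time

# Athena query states after which polling can stop.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query_blocking(client, sql, database, output_location,
                       poll_interval=0.2, timeout=300.0):
    """Submit a query, poll until it reaches a terminal state, and
    return (state, elapsed_seconds, results).

    `client` is assumed to behave like boto3.client("athena").
    `results` is None unless the query succeeded.
    """
    start = time.perf_counter()

    # Call 1: submit the query and get its execution ID.
    execution_id = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll the execution state until the query finishes or we give up.
    while True:
        response = client.get_query_execution(QueryExecutionId=execution_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        if time.perf_counter() - start > timeout:
            raise TimeoutError(f"query {execution_id} still {state} after {timeout}s")
        time.sleep(poll_interval)

    # Call 2: fetch the results (first page only in this sketch).
    results = None
    if state == "SUCCEEDED":
        results = client.get_query_results(QueryExecutionId=execution_id)

    elapsed = time.perf_counter() - start
    return state, elapsed, results
```

With this shape, the measured time includes the polling overhead, so `poll_interval` puts a floor on the resolution of the response times; that is part of what I'm unsure about.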
My questions:
Has anyone run a benchmark like this before? If so, I would love to hear which approaches were taken. The benchmark I'm planning uses ~30 concurrent users that simulate submitting queries of varying complexity. I would like to measure queries per second and query response times (median and 99th percentile).
I came across the PyAthena client. Is it recommended for this purpose? Are there any other recommended clients or approaches?
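For reference, this is roughly how I would compute the metrics from the first question if I had to do it by hand from a list of per-query latencies. It is only a sketch: `summarize` is a made-up helper name, the 99th percentile uses the simple nearest-rank definition, and `wall_clock_seconds` is assumed to be the total duration of the test run.

```python
import math
import statistics


def summarize(latencies, wall_clock_seconds):
    """Return queries per second, median, and nearest-rank p99
    for a list of per-query latencies (in seconds)."""
    if not latencies:
        raise ValueError("no latencies recorded")

    ordered = sorted(latencies)
    n = len(ordered)

    # Nearest-rank percentile: smallest value with at least 99% of
    # samples at or below it (1-based rank, hence the -1 below).
    rank = math.ceil(n * 99 / 100)

    return {
        "qps": n / wall_clock_seconds,
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
    }
```

For example, `summarize([0.8, 1.2, 1.0], wall_clock_seconds=3.0)` would describe three queries completed over a three-second run.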
Thank you!