I have a use case where I want to call a downstream API for all records in S3.
The API has a rate limit of 250 TPS, so I can call it at most 250 times per second.
The records can all arrive in S3 at about the same time (within a few seconds). They will be partitioned into multiple smaller files (say 1000 records each). I was thinking of having a lambda, triggered by S3 PUT, that puts these records into an SQS queue.
There will be another lambda which polls the queue and calls the downstream API.
S3 -> lambda-1 -> SQS -> lambda-2 (calls downstream API)
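For reference, here is a rough sketch of what I have in mind for lambda-1. It is only an illustration under assumptions: I am assuming each S3 file is JSON Lines (one record per line), and `QUEUE_URL` and `to_sqs_batches` are hypothetical names of my own. The one hard fact it relies on is that SQS `SendMessageBatch` accepts at most 10 entries per call.

```python
import json
import os
import urllib.parse
from typing import Dict, Iterable, Iterator, List

SQS_BATCH_LIMIT = 10  # SendMessageBatch accepts at most 10 entries per call
QUEUE_URL = os.environ.get("QUEUE_URL", "")  # hypothetical env var holding the queue URL


def to_sqs_batches(records: Iterable[dict]) -> Iterator[List[Dict[str, str]]]:
    """Group records into SendMessageBatch entry lists of at most 10."""
    batch: List[Dict[str, str]] = []
    for i, record in enumerate(records):
        # Ids only need to be unique within a single batch request.
        batch.append({"Id": str(i), "MessageBody": json.dumps(record)})
        if len(batch) == SQS_BATCH_LIMIT:
            yield batch
            batch = []
    if batch:
        yield batch


def handler(event, context):
    """lambda-1: triggered by S3 PUT, copies each record into the SQS queue."""
    import boto3  # imported lazily so the helper above can be tested without AWS

    s3 = boto3.client("s3")
    sqs = boto3.client("sqs")
    for rec in event["Records"]:
        bucket = rec["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(rec["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        # Assumption: one JSON record per line.
        records = (json.loads(line) for line in body.splitlines() if line.strip())
        for entries in to_sqs_batches(records):
            sqs.send_message_batch(QueueUrl=QUEUE_URL, Entries=entries)
```

With 1000 records per file this works out to 100 `SendMessageBatch` calls per invocation. The sketch does not handle partial batch failures.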
How can I make sure that lambda-2 reads no more than 250 messages per second from the queue?
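To make the constraint concrete, this is the kind of pacing I could do inside lambda-2 itself. It is a sketch only, and it assumes that just one instance of lambda-2 runs at a time (for example, by setting its reserved concurrency to 1). With several concurrent instances, each would pace itself independently and the total could exceed the limit. `call_paced` and `call_api` are hypothetical names.

```python
import time

MAX_TPS = 250  # downstream API limit


def call_paced(messages, call_api, max_tps=MAX_TPS,
               clock=time.monotonic, sleep=time.sleep):
    """Call call_api(message) for each message, waiting at least
    1/max_tps seconds after one call returns before starting the next."""
    interval = 1.0 / max_tps
    next_allowed = clock()
    for message in messages:
        delay = next_allowed - clock()
        if delay > 0:
            sleep(delay)
        call_api(message)
        # Earliest time the next call may start.
        next_allowed = clock() + interval
```

Because the calls here are made one after another, the real throughput would be below 250 TPS whenever the API takes noticeable time to respond, so this caps the rate but does not guarantee I reach it.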
Is there an alternative design I can look into if this won't work?