I have a database consisting of many items. At least once per day I want to loop through all items in this database and, for each item, call an external API to fetch current data about it and store that data back in my database.
With this scenario in mind, I was thinking of using Lambda, DynamoDB and SNS in the following way:
- Scheduled Lambda (worker) that scans all items in the DynamoDB table
- For each item, publish a message to an SNS topic with details about that specific item
- Another Lambda (consumer/processor) subscribes to that SNS topic and receives each item
- For each item received, perform request to external API and update the item in DynamoDB
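For illustration, here is a minimal Python sketch of the flow described above. The table, topic, and external API call are hypothetical in-memory stand-ins (in a real deployment these would be boto3 DynamoDB/SNS clients and an HTTP request); the point is the shape of the two handlers, including following `LastEvaluatedKey` since DynamoDB scans are paginated:

```python
import json

class FakeTable:
    """In-memory stand-in for a DynamoDB table with a paginated scan,
    mimicking how a real Scan returns LastEvaluatedKey per page."""
    def __init__(self, items, page_size=2):
        self.items = items
        self.page_size = page_size

    def scan(self, exclusive_start_key=None):
        start = exclusive_start_key or 0
        result = {"Items": self.items[start:start + self.page_size]}
        if start + self.page_size < len(self.items):
            result["LastEvaluatedKey"] = start + self.page_size
        return result

class FakeTopic:
    """Stand-in for an SNS topic: collects published messages."""
    def __init__(self):
        self.messages = []

    def publish(self, message):
        self.messages.append(message)

def worker_handler(table, topic):
    """Scheduled (worker) Lambda: page through every item and publish
    one message per item. Keeps scanning until LastEvaluatedKey is
    absent, so tables larger than one page are fully covered."""
    published = 0
    key = None
    while True:
        page = table.scan(exclusive_start_key=key)
        for item in page["Items"]:
            topic.publish(json.dumps(item))
            published += 1
        key = page.get("LastEvaluatedKey")
        if key is None:
            break
    return published

def fetch_external(item_id):
    # Stand-in for the real external API request.
    return {"fetched_for": item_id}

def consumer_handler(record, store):
    """Consumer Lambda: invoked once per SNS message; fetches fresh
    data for the item and writes it back (here: into a dict)."""
    item = json.loads(record)
    item["data"] = fetch_external(item["id"])
    store[item["id"]] = item
```

Since each SNS message triggers its own consumer invocation, the per-item API calls fan out and run concurrently rather than in one long loop.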
This setup should be scalable, easy to configure and maintain, and hopefully cost-efficient as well. But will it cope if the DynamoDB table has 1000+ items to loop through at least once every day? Is there any fault tolerance in this setup? Will it handle being triggered more than once per day, and more importantly, will it still be cost-efficient if triggered, say, once every hour? Is there a better way of doing this?
Somehow I feel I should be using SQS instead, but maybe it's not really useful in a serverless setup, since you can't poll the queue to fetch new items to process?