I was looking at our bill and apparently we are charged more than $600 for Amazon Simple Storage Service USE2-Requests-Tier2, meaning that we have more than 1 billion GET requests a month, so more than 30 million every day? We made sure that none of our S3 buckets are public, so attacks should not be possible. I have no idea how we are getting so many requests, as we only have about 20 active users of our app every day. Even assuming that each of them makes about 10 GET requests to our API, which uses Lambda and boto3 to download 10 files from an S3 bucket to the Lambda's tmp folder and then returns a value, that is only about 2,000 S3 GET requests a day, nowhere near tens of millions.

We also have another EventBridge-triggered Lambda, which uses Athena to query our database (S3) and runs every 2 hours. Could this be a potential cause? Can anyone shed some light on this, and on how we can take a better look into where and why we are getting so many GET requests? Thank you.

  • You can try creating budgets https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/budgets-managing-costs.html and using the Cost Explorer API https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/ce-api.html – Felipe Oct 27 '21 at 05:30

1 Answer

When you execute a query in Athena, during the initial query planning phase it will list the location of the table, or the locations of all the partitions of the table involved in the query. In the next phase it will make a GET request for each and every one of the objects that it found during query planning.
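To get a rough feel for how many GET requests a single query could trigger, you can count the objects under the table's location. This is only a sketch: `count_objects` is a hypothetical helper, and it assumes you pass in a boto3 S3 client and that the table's data lives under one bucket and prefix.

```python
def count_objects(s3_client, bucket, prefix):
    """Count the objects under s3://bucket/prefix.

    s3_client is expected to behave like boto3.client("s3");
    list_objects_v2 returns at most 1,000 keys per page, so we paginate.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no keys.
        total += len(page.get("Contents", []))
    return total
```

With boto3 installed you could call it as `count_objects(boto3.client("s3"), "my-bucket", "tables/events/")`, where the bucket and prefix are placeholders for your own table location.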

If your tables consist of many small files, it is not uncommon to see S3 charges that are comparable to or higher than the Athena charge. If those small files are Parquet files, the problem can be bigger, because Athena will also make GET requests for them during query planning to figure out splits.
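As a back-of-the-envelope check, you can work out how many objects each query would have to touch for the scheduled Athena job alone to explain the bill. The sketch below assumes one GET per object per query, a 30-day month, and the figures from the question (about 1 billion GETs a month, one query every 2 hours); `objects_needed_per_query` is a hypothetical helper.

```python
def objects_needed_per_query(monthly_gets, queries_per_day, days=30):
    """Objects each query must read to reach monthly_gets,
    assuming exactly one GET per object per query."""
    return monthly_gets / (queries_per_day * days)

queries_per_day = 24 // 2  # one query every 2 hours -> 12 per day
needed = objects_needed_per_query(1_000_000_000, queries_per_day)
print(round(needed))  # 2777778
```

If the tables hold far fewer objects than that, the Athena job is unlikely to be the only source of the requests. If they hold millions of small files, it is a plausible one.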

One way to figure out if this is the case is to enable S3 access logging on the bucket, create a new IAM session, and run a query. Wait a few minutes and then look for all S3 operations that were issued with that session. That count is an estimate of the S3 operations per query.
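Once the access logs have been delivered, counting the operations for that session could look roughly like this. It is a minimal sketch, assuming the space-delimited server access log layout (bucket owner, bucket, time in brackets, remote IP, requester, request ID, operation, key, ...) and that the requester is an assumed-role ARN ending in the session name. `count_operations`, the session name, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches only the leading fields of a server access log record.
LOG_PREFIX = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
)

def count_operations(lines, session_name):
    """Count S3 operations whose requester ARN ends with /session_name."""
    counts = Counter()
    for line in lines:
        match = LOG_PREFIX.match(line)
        if match is None:
            continue  # skip lines that do not look like log records
        if match.group("requester").endswith("/" + session_name):
            counts[match.group("operation")] += 1
    return counts

# Made-up sample records; real logs would be read from the log bucket.
SAMPLE_LINES = [
    'ownerid my-bucket [27/Oct/2021:05:00:01 +0000] 192.0.2.3 '
    'arn:aws:sts::123456789012:assumed-role/AthenaRole/probe-session '
    'REQ1 REST.GET.OBJECT data/part-0000.parquet '
    '"GET /data/part-0000.parquet HTTP/1.1" 200 - 1024 1024 12 10 "-" "-" -',
    'ownerid my-bucket [27/Oct/2021:05:00:02 +0000] 192.0.2.3 '
    'arn:aws:sts::123456789012:assumed-role/AthenaRole/probe-session '
    'REQ2 REST.GET.BUCKET - '
    '"GET /?list-type=2&prefix=data/ HTTP/1.1" 200 - 512 - 8 7 "-" "-" -',
    'ownerid my-bucket [27/Oct/2021:05:00:03 +0000] 192.0.2.9 '
    'arn:aws:sts::123456789012:assumed-role/ApiRole/other-session '
    'REQ3 REST.GET.OBJECT files/a.bin '
    '"GET /files/a.bin HTTP/1.1" 200 - 2048 2048 15 12 "-" "-" -',
]

counts = count_operations(SAMPLE_LINES, "probe-session")
```

Here `counts["REST.GET.OBJECT"]` is the number of object GETs issued by the probe session, which is the per-query figure you are after.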

Theo