I have customer assets stored in S3, with an account-related value serving as the first element in the path to each asset, e.g.

  • account-1/media/video/382476581823.mp4
  • account-1/images/2348752.png

I would like to find the total amount of storage being consumed by all assets for an account -- all the assets that have a prefix of "account-1" for the path in the above example. I have a working solution that iterates over the ObjectSummary objects returned by the S3 query but it's too slow for my needs because it performs an HTTP request for each object.

I'm wondering if it's possible to perform the calculation in an S3 query similar to what you might do with DynamoDB -- ask S3 to perform the calculation and return the total.

Note: Using aws-sdk-ruby

AndyV

2 Answers

I should have mentioned that we're using an old 1.x version of aws-sdk-ruby, so my answer might differ from what you would find in a current version of the S3 SDK.

I was able to use the AWS::S3::Client#list_objects method and iterate over those results. While this is not exactly what I was hoping for (the calculation is still performed locally), at least it avoids the HTTP HEAD request to the S3 media that is invoked when iterating over the results of the S3::Bucket#objects call.
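
For readers on a current SDK, the same approach can be sketched roughly as below. Note the assumptions: this uses Aws::S3::Client#list_objects_v2 from the aws-sdk-s3 (v3) gem rather than the 1.x call described above, and the helper name total_size_for_prefix and the bucket name in the usage comment are made up for illustration. The sum is still computed locally, but each listing request returns the sizes for up to 1,000 keys, so no per-object request is needed.

```ruby
# Sum the sizes (in bytes) of every object whose key starts with `prefix`.
# `client` is assumed to behave like Aws::S3::Client from aws-sdk-s3 (v3).
def total_size_for_prefix(client, bucket, prefix)
  total = 0
  token = nil

  loop do
    params = { bucket: bucket, prefix: prefix }
    # Only send a continuation token once S3 has handed one back.
    params[:continuation_token] = token if token

    page = client.list_objects_v2(params)
    # Each listed entry already carries its size, so no HEAD request is needed.
    total += page.contents.sum(&:size)

    break unless page.is_truncated
    token = page.next_continuation_token
  end

  total
end

# Hypothetical usage (requires the aws-sdk-s3 gem and configured credentials):
#   require 'aws-sdk-s3'
#   client = Aws::S3::Client.new
#   puts total_size_for_prefix(client, 'my-bucket', 'account-1/')
```

Passing the prefix with a trailing slash ("account-1/") keeps keys such as "account-10/..." out of the total.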

AndyV

S3 announced a new feature, S3 Select, that lets you run SQL expressions against the contents of an object stored in S3. Here's the launch announcement:

https://aws.amazon.com/blogs/aws/s3-glacier-select/

Doug Schwartz