I use AWS S3 to store a few thousand files per hour, and it works like a charm. I'm curious whether I can filter the objects by time, date, or any metadata I have attached to them. I'm able to run a Node process that fetches the object list and filters on date and time, but not on metadata. Is there another way to do it, or a better option?

Arun

1 Answer

No, you cannot filter on metadata with the S3 API.

To do what you're asking, you would need to List Objects (GET Bucket) on the bucket to get all the keys, then individually ask for metadata for each key (HEAD Object). Then in your own code, you can filter out objects that don't match.
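The list-then-HEAD approach above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes `client` is a boto3 S3 client (for example `boto3.client("s3")`), and the function name `filter_keys_by_metadata` and the `predicate` callback are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Iterator


def filter_keys_by_metadata(
    client,
    bucket: str,
    predicate: Callable[[Dict[str, str]], bool],
    prefix: str = "",
) -> Iterator[str]:
    """Yield keys whose user-defined metadata satisfies `predicate`.

    `client` is assumed to behave like a boto3 S3 client.
    """
    # List Objects: the paginator follows continuation tokens for us.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from the page when nothing matches.
        for obj in page.get("Contents", []):
            # HEAD Object: one extra request per key, which is the slow part.
            head = client.head_object(Bucket=bucket, Key=obj["Key"])
            # The filtering happens here, client-side, not in S3.
            if predicate(head.get("Metadata", {})):
                yield obj["Key"]
```

A call might look like `list(filter_keys_by_metadata(client, "my-bucket", lambda m: m.get("source") == "sensor-a"))`, where the bucket name and the `source` metadata field are made up for the example.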

Obviously, this would be very slow to run live if you have more than a few thousand objects, since it costs one HEAD request per object. You'll want to either filter down to a manageable number based on prefix or keep an index yourself (Elasticsearch, maybe?). It's common to encode some metadata in the object keys so that you can filter by prefix.
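As an illustration of encoding metadata in the key, here is a small sketch. The `<source>/<YYYY>/<MM>/<DD>/<HH>/<filename>` layout and the helper names `build_key` and `build_prefix` are assumptions made up for this example, not an S3 convention; any layout works as long as the fields you filter on most often come first.

```python
from datetime import datetime, timezone
from typing import Optional


def build_key(source: str, uploaded_at: datetime, filename: str) -> str:
    """Build a key of the (hypothetical) form source/YYYY/MM/DD/HH/filename.

    Pass a timezone-aware datetime; it is normalised to UTC first.
    """
    ts = uploaded_at.astimezone(timezone.utc)
    return f"{source}/{ts:%Y/%m/%d/%H}/{filename}"


def build_prefix(
    source: str,
    year: int,
    month: Optional[int] = None,
    day: Optional[int] = None,
    hour: Optional[int] = None,
) -> str:
    """Build a listing prefix, stopping at the first field left unspecified."""
    parts = [source, f"{year:04d}"]
    for value in (month, day, hour):
        if value is None:
            break
        parts.append(f"{value:02d}")
    # The trailing slash keeps "2015/1" from also matching "2015/11".
    return "/".join(parts) + "/"
```

The resulting prefix can then be passed as the `Prefix` parameter of a List Objects request, so S3 does the date and source filtering and no per-object HEAD request is needed.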

Nathaniel Waisbrot
  • Thanks, I can try adding prefix to objects then. – Arun Nov 24 '15 at 23:51
  • Has this answer changed since 2015? I'm looking into this right now. – Nic Cottrell Jun 06 '18 at 14:05
  • I'm also looking into lifecycle rules based on metadata values and would like to know if this has changed. – ndtreviv Jul 10 '18 at 08:40
  • In 2019, it may be worth looking at [s3select](https://aws.amazon.com/blogs/aws/s3-glacier-select/) or [AWS Athena](https://aws.amazon.com/athena/) with [AWS Glue](https://aws.amazon.com/glue/). – toast38coza Sep 02 '19 at 20:06
  • Any changes? I am also looking at "You'll want to either filter down to a manageable number based on prefix or keep an index yourself (elastic search, maybe?)", preferably for a single bucket. – xrkr Jul 12 '23 at 05:52