I have an S3 bucket in which the data is stored in files named by date. I can fetch the data for a single day by passing that date as the Prefix value, since that day's data is stored under it. Input: 2022/10/15

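import boto3

# Credentials must be keyword arguments: the second positional argument of boto3.client() is region_name.
s3 = boto3.client('s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key)
prefix = '2022/10/15'  # the date to fetch; avoids shadowing the built-in input()
response = s3.list_objects(Bucket=bucket, Prefix=prefix)
print(response)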

But I want to fetch the data for a date range. How can I change this code to work for that scenario? For example, if I give the input date 2022/10/01, I want to fetch the data from 2022/10/01 to today. How can I iterate over the dates and fetch the data for all files under 2022/10/01, 2022/10/02, ... up to today?

John Rotenstein
Lilly s
Does this answer your question? [Print all day-dates between two dates](https://stackoverflow.com/questions/7274267/print-all-day-dates-between-two-dates) – luk2302 Oct 17 '22 at 19:09

That is not something S3 supports out of the box in any way. Instead generate the list of date prefixes between the two dates and then list the objects for each prefix. – luk2302 Oct 17 '22 at 19:10
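
The approach suggested in the comment above (generate one prefix per day, then list the objects under each prefix) might look roughly like the sketch below. The function names `date_prefixes` and `list_keys_for_prefixes` are hypothetical, and the code assumes the keys start with a zero-padded `YYYY/MM/DD` date, as in the question. The S3 client is passed in as an argument, so the credentials are set up elsewhere.

```python
from datetime import date, datetime, timedelta


def date_prefixes(start, end=None):
    """Return one 'YYYY/MM/DD' prefix per day from start to end, inclusive.

    `start` is a string such as '2022/10/01'; `end` is a datetime.date
    and defaults to today (hypothetical helper, not part of boto3).
    """
    first = datetime.strptime(start, "%Y/%m/%d").date()
    last = end or date.today()
    days = (last - first).days
    return [(first + timedelta(days=n)).strftime("%Y/%m/%d") for n in range(days + 1)]


def list_keys_for_prefixes(s3_client, bucket, prefixes):
    """Yield every object key found under each prefix.

    The paginator follows continuation tokens, so prefixes holding
    more than 1,000 objects are listed completely.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for prefix in prefixes:
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            # "Contents" is absent when nothing matches the prefix.
            for obj in page.get("Contents", []):
                yield obj["Key"]
```

With a real client this could be called as `list_keys_for_prefixes(boto3.client("s3"), bucket, date_prefixes("2022/10/01"))`. It makes at least one request per day in the range, which is simple but can become slow for long ranges.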

1 Answer

The list_objects_v2() method supports a StartAfter parameter:

StartAfter (string) -- StartAfter is where you want Amazon S3 to start listing from. Amazon S3 starts listing after this specified key. StartAfter can be any key in the bucket.

So, you could use StartAfter to commence your listing at the name of the first directory (for example, 2022/10/01), and then receive a list of all objects after that key. Since the folders are named with zero-padded dates in YYYY/MM/DD form, the keys come back sorted in chronological order. Just keep reading the file list until the Key no longer matches the folder-naming standard.

Rather than listing the contents of each folder, you are listing the contents of the bucket. But, that's the same result since Amazon S3 does not actually use folders.

Please note that list_objects_v2() returns at most 1,000 objects per call, so it might be necessary to loop through the result set using ContinuationToken.
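
As a rough illustration of the StartAfter approach, here is a minimal sketch. The name `list_keys_from` and the `DATE_KEY` pattern are hypothetical, and the code assumes every data key starts with a zero-padded `YYYY/MM/DD` date. It uses the boto3 paginator for `list_objects_v2`, which handles ContinuationToken internally.

```python
import re

# Assumed folder-naming standard: keys begin with YYYY/MM/DD.
DATE_KEY = re.compile(r"^\d{4}/\d{2}/\d{2}")


def list_keys_from(s3_client, bucket, start, end=None):
    """Yield keys that sort after `start`, in the order S3 returns them.

    Listing stops at the first key that does not begin with a date, or
    whose date is later than `end` (a 'YYYY/MM/DD' string) if given.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, StartAfter=start):
        # "Contents" is absent when no keys are returned.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            match = DATE_KEY.match(key)
            if match is None or (end is not None and match.group(0) > end):
                return
            yield key
```

For the question's example, `list_keys_from(boto3.client("s3"), bucket, "2022/10/01")` would return everything from 2022/10/01 onward. Because the comparison is on strings, this only works while the date parts are zero-padded.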

John Rotenstein