
I have a lot of files in my S3 bucket. Is there an AWS CLI command I can use to find the most recent file with a given prefix in S3? And how can I copy that file from S3 to a local folder? Can I use Boto3 or another Python library to do this?

  • Usually you would arrange your data in a format that makes this easier. You can put files in `YEAR/MONTH/DAY/file` path for example. – kichik Jul 10 '20 at 01:35

3 Answers


Here's how to do it in Python:

import boto3

s3_client = boto3.client('s3')

response = s3_client.list_objects_v2(Bucket='MY-BUCKET', Prefix='foo/')
objects = sorted(response['Contents'], key=lambda obj: obj['LastModified'])

# Key of the latest object (last element after sorting by LastModified)
latest_object = objects[-1]['Key']
filename = latest_object.rsplit('/', 1)[-1]  # Strip the path, keep the file name

# Download it to current directory
s3_client.download_file('MY-BUCKET', latest_object, filename)

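Basically, you retrieve all objects under the prefix, then sort them by LastModified on the client. S3 returns listings in key order and cannot sort them by date on the server side.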

Please note that list_objects_v2() returns at most 1000 objects per call. If more objects match the prefix, you'll need to loop using the continuation token or use a paginator. See: Paginators (Boto3 documentation)
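For prefixes with more than 1000 objects, the paginator approach could look roughly like the sketch below. The function name find_latest_key is made up for illustration, and the client is passed in as a parameter so the function does not depend on how credentials are configured. It tracks the newest object while iterating rather than sorting the whole listing.

```python
from typing import Optional


def find_latest_key(s3_client, bucket: str, prefix: str) -> Optional[str]:
    """Return the key of the most recently modified object under prefix.

    Hypothetical helper: s3_client is expected to be a boto3 S3 client.
    Returns None when nothing matches the prefix.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    latest = None
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when no objects match.
        for obj in page.get("Contents", []):
            if latest is None or obj["LastModified"] > latest["LastModified"]:
                latest = obj
    return latest["Key"] if latest else None


# Example usage (requires boto3 and AWS credentials; MY-BUCKET and foo/
# are the placeholders from the answer above):
#
#   import boto3
#   s3_client = boto3.client("s3")
#   key = find_latest_key(s3_client, "MY-BUCKET", "foo/")
#   if key is not None:
#       s3_client.download_file("MY-BUCKET", key, key.rsplit("/", 1)[-1])
```

Keeping only the running maximum avoids holding every object record in memory, which may matter for very large prefixes.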

John Rotenstein

This command will list the 'latest' object for a given prefix:

aws s3api list-objects --bucket MY-BUCKET --prefix foo/ --query 'sort_by(Contents, &LastModified)[-1].Key' --output text

You could combine it with a copy command:

key=$(aws s3api list-objects --bucket MY-BUCKET --prefix foo/ --query 'sort_by(Contents, &LastModified)[-1].Key' --output text)
aws s3 cp "s3://MY-BUCKET/$key" .

The --query parameter is very powerful. See: JMESPath Tutorial

John Rotenstein

For finding the most recent file, you can refer to this answer: get last modified object from S3 CLI. To restrict the listing to objects matching a prefix, you can use

aws s3 ls $BUCKET --recursive | sort | grep <prefix>
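The pipeline above works because each line of `aws s3 ls --recursive` output starts with the modification date and time, so a plain text sort is chronological and the newest match ends up last. As a rough sketch of the same idea in Python, the hypothetical function below parses such lines and picks the newest key. It assumes the usual "date time size key" column layout and, unlike grep, only matches the prefix at the start of the key.

```python
from typing import Iterable, Optional


def latest_key_from_listing(lines: Iterable[str], prefix: str) -> Optional[str]:
    """Pick the newest key with the given prefix from `aws s3 ls --recursive` output.

    Assumes each line looks like: "2020-07-10 01:35:12       1234 foo/bar.txt".
    """
    latest_stamp, latest_key = "", None
    for line in lines:
        # Split into date, time, size, and the rest (keys may contain spaces).
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # Skip blank or unexpected lines.
        date, time, _size, key = parts
        key = key.rstrip("\n")
        stamp = f"{date} {time}"
        # ISO-style timestamps compare correctly as plain strings.
        if key.startswith(prefix) and stamp >= latest_stamp:
            latest_stamp, latest_key = stamp, key
    return latest_key
```

The lines could come from a saved listing file or from the captured output of the CLI command.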

Thanks

Ashish

Ashish Bhatia