Does an S3 bucket have information about when it was last updated? How can I find the last time any of the objects in the bucket were updated?
7 Answers
There is no native support for bucket last modified time. The way I do it is to use the AWS CLI, sort the output, take the bottom line, and print the first two fields.
$ aws s3 ls mybucket --recursive | sort | tail -n 1 | cut -d ' ' -f1,2
2016-03-18 22:46:48

Recommendation, tl;dr
At the time of this writing, based on the simplistic performance test below, the best compromise between simplicity and performance is aws s3 ls --recursive (Option #2).
3 ways to get the last modified object
1. Using s3cmd (see s3cmd Usage, or explore the man page after installing it using sudo pip install s3cmd)
s3cmd ls s3://the-bucket | sort | tail -n 1
2. Using AWS CLI's s3
aws s3 ls the-bucket --recursive --output text | sort | tail -n 1 | awk '{print $1"T"$2","$3","$4}'
(Note that awk in the above refers to GNU awk. See this if you need to install it, as well as any other GNU utilities, on macOS.)
3. Using AWS CLI's s3api (with either list-objects or list-objects-v2)
aws s3api list-objects-v2 --bucket the-bucket | jq -r '.Contents | max_by(.LastModified) | [.Key, .LastModified, .Size] | @csv'
Note that both of the s3api commands are paginated, and improved handling of pagination is a fundamental improvement in v2 of list-objects. If the bucket has more than 1,000 objects (use s3cmd du "s3://the-bucket" | awk '{print $2}' to get the object count), then you'll need to handle the API's pagination and make multiple calls to get back all the results, since the returned results are sorted in UTF-8 binary order and not by 'Last Modified'.
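If you do need to handle that pagination programmatically, here is a minimal sketch using boto3's paginator (my illustration, not part of the original answer; the bucket name is a placeholder):
import boto3

def last_modified_object(bucket_name: str):
    """Walk every page of list_objects_v2 and track the newest object."""
    paginator = boto3.client('s3').get_paginator('list_objects_v2')
    newest = None
    for page in paginator.paginate(Bucket=bucket_name):
        # each page holds at most 1000 keys, sorted by key (UTF-8 order), not by date
        for obj in page.get('Contents', []):
            if newest is None or obj['LastModified'] > newest['LastModified']:
                newest = obj
    return newest  # dict with Key, LastModified, Size; None if the bucket is empty

print(last_modified_object('the-bucket'))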
Performance comparison
Here is a simple performance comparison of the above three methods executed for the same bucket. For simplicity, the bucket had fewer than 1,000 objects. Here is the one-liner to see the execution times:
export bucket_name="the-bucket" && \
( \
time ( s3cmd ls --recursive "s3://${bucket_name}" | awk '{print $1"T"$2","$3","$4}' | sort | tail -n 1 ); \
time ( aws s3 ls --recursive "${bucket_name}" --output text | awk '{print $1"T"$2","$3","$4}' | sort | tail -n 1 ); \
time ( aws s3api list-objects-v2 --bucket "${bucket_name}" | jq -r '.Contents | max_by(.LastModified) | [.LastModified, .Size, .Key] | @csv' ); \
time ( aws s3api list-objects --bucket "${bucket_name}" | jq -r '.Contents | max_by(.LastModified) | [.LastModified, .Size, .Key] | @csv' ) \
) > output.log
(Each time result is printed to the terminal via stderr; output.log will store the last modified objects listed by each command.)
The output of the above is as follows:
( s3cmd ls --recursive ...) 1.10s user 0.10s system 79% cpu 1.512 total
( aws s3 ls --recursive ...) 0.72s user 0.12s system 74% cpu 1.128 total
( aws s3api list-objects-v2 ...) 0.54s user 0.11s system 74% cpu 0.867 total
( aws s3api list-objects ...) 0.57s user 0.11s system 75% cpu 0.900 total
For the same number of objects being returned, the aws s3api calls are appreciably more performant; however, there is additional (scripting) complexity for dealing with the pagination of the API.
Useful link(s): See Leveraging s3 and s3api to understand the difference between aws s3 and aws s3api.

As others have commented, there's no magic bit of metadata that stores this information. You just have to loop over the objects.
Code to do that with boto3:
import boto3
from datetime import datetime
def bucket_last_modified(bucket_name: str) -> datetime:
"""
Given an S3 bucket, returns the last time that any of its objects was
modified, as a timezone-aware datetime.
"""
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)
objects = list(bucket.objects.all())
return max(obj.last_modified for obj in objects)

- Hi, what if I want to get the filename which has the last_modified time? – wawawa Aug 06 '21 at 14:58
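A minimal sketch of one way to do that (not from the answer author; it reuses the same boto3 resource API, with max() keyed on last_modified):
import boto3

def last_modified_key(bucket_name: str):
    """Return (key, last_modified) for the newest object in the bucket."""
    bucket = boto3.resource('s3').Bucket(bucket_name)
    # max() raises ValueError if the bucket is empty
    newest = max(bucket.objects.all(), key=lambda obj: obj.last_modified)
    return newest.key, newest.last_modified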
My workaround is to write a bucket_metadata.json file to the bucket with a "last_updated" key and a unix timestamp:
{ "last_updated": 1634243586 }
Then whenever you update the bucket, you generate another timestamp and re-write the file.
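A minimal sketch of that write step with boto3 (the key name follows this answer's convention; the helper name is my own):
import json
import time

import boto3

def touch_bucket_metadata(bucket_name: str) -> None:
    """Overwrite bucket_metadata.json with the current unix timestamp."""
    body = json.dumps({'last_updated': int(time.time())})
    boto3.client('s3').put_object(
        Bucket=bucket_name,
        Key='bucket_metadata.json',
        Body=body.encode('utf-8'),
    )
Call it from whatever code performs the uploads; reading back this one small object is then much cheaper than listing the whole bucket.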

Leveraging the aggregation features of the aws s3api command, you can easily get some key metrics via:
aws s3api list-objects --bucket "bucket_name" --output json --query "[sum(Contents[].Size), length(Contents[]), max(Contents[].LastModified)]"
If the bucket is empty, the aggregations fail due to null values, and you will receive an error message: In function sum(), invalid type for value: None, expected one of: ['array-number'], received: "null"
If the bucket is too large, your command might get killed by the OS.
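One way to guard against the empty-bucket error above is to probe for a first key before aggregating; this is my sketch, not part of the original answer:
bucket_name="the-bucket"
# an empty bucket has no Contents, so the probe prints "None"
first_key=$(aws s3api list-objects-v2 --bucket "${bucket_name}" \
    --max-items 1 --query "Contents[0].Key" --output text)
if [ "${first_key}" != "None" ]; then
    aws s3api list-objects --bucket "${bucket_name}" --output json \
        --query "[sum(Contents[].Size), length(Contents[]), max(Contents[].LastModified)]"
fi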

I have a bash script and a Python script that can do the job, but I find them quite slow when an S3 bucket has millions of objects, so if anyone can improve the scripts that would be great; one possible speed-up is sketched after the Python version below. Bash:
#!/bin/bash
bucket_name_list=$(aws s3api list-buckets --query "Buckets[].Name" --output text)
for bucket_name in $bucket_name_list
do
    # echo "$bucket_name"
    last_access_time=$(aws s3 ls "$bucket_name" --recursive | sort | tail -n 1 | awk '{print $1"T"$2","$3","$4}')
    echo "${bucket_name}: -------> ${last_access_time}"
done
Python:
import boto3

aws_session = boto3.session.Session(profile_name='default')
s3_resource = aws_session.resource('s3')


def bucket_last_modified() -> None:
    for s3_bucket in s3_resource.buckets.all():
        s3_bucket_name = s3_bucket.name
        bucket = s3_resource.Bucket(s3_bucket_name)
        objects = list(bucket.objects.all())
        if len(objects) != 0:
            # newest LastModified across every object in the bucket
            last_access_time = max(obj.last_modified for obj in objects)
            print('Bucket Name: ' + s3_bucket_name + ' last access time: ' + str(last_access_time))
        else:
            print('Bucket Name: ' + s3_bucket_name + ' is empty')


if __name__ == '__main__':
    bucket_last_modified()
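One possible improvement, sketched here as an assumption rather than a tested fix: scan buckets concurrently and use the low-level paginator instead of the resource layer, since each scan is network-bound (boto3 clients are documented as thread-safe):
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.session.Session(profile_name='default').client('s3')


def report(bucket_name: str) -> str:
    """Scan one bucket page by page and report its newest LastModified."""
    newest = None
    for page in s3.get_paginator('list_objects_v2').paginate(Bucket=bucket_name):
        for obj in page.get('Contents', []):
            if newest is None or obj['LastModified'] > newest:
                newest = obj['LastModified']
    return f'{bucket_name}: {newest}' if newest else f'{bucket_name}: empty'


if __name__ == '__main__':
    names = [b['Name'] for b in s3.list_buckets()['Buckets']]
    # scan several buckets at once; each worker just waits on the network
    with ThreadPoolExecutor(max_workers=8) as pool:
        for line in pool.map(report, names):
            print(line)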

The Amazon S3 API spec for GET Bucket Object Versions (available at: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGETVersion.html) says that there is a LastModified property returned - but I'm not sure if it gets updated on change for each object ...
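For illustration (my example, not from the answer, and assuming a versioned bucket): the newest per-version LastModified can be pulled from the CLI; JMESPath's max() works here because the timestamps are ISO-8601 strings that sort lexically:
aws s3api list-object-versions --bucket the-bucket \
    --query "max(Versions[].LastModified)" --output text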

- This isn't the answer to the question that was asked. `LastModified`: "Date and time the *object* was last modified." This property is returned for *each individual object version*. It is not a single value for the bucket itself. – Michael - sqlbot Mar 19 '16 at 00:23
- Yes, you are right - so I guess the only way is to recursively scan the whole subtree ... might be expensive – Krzysztof Kielak Mar 19 '16 at 19:33