
Let's say I iterate over my bucket every hour to check whether new objects were added. Currently I list all objects and check whether any has a modification time newer than the latest one seen in the previous iteration, which is inefficient in terms of runtime.

My code looks something like this:

// checkpoint is assumed to be a DateTime? holding the newest LastModified seen so far
DateTime lastDateTime = DateTime.MinValue;
if (checkpoint != null)
    lastDateTime = checkpoint.Value;

List<S3Object> newFiles = new List<S3Object>();

ListObjectsV2Request request = new ListObjectsV2Request
{
    BucketName = myBucketName
};


ListObjectsV2Response response;

do
{
    response = await s3Client.ListObjectsV2Async(request);
    // keep only objects modified after the previous checkpoint
    newFiles.AddRange(response.S3Objects.Where(x => x.LastModified > lastDateTime));
    request.ContinuationToken = response.NextContinuationToken;

} while (response.IsTruncated);
            
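// advance the checkpoint only when something new was found (newFiles[0] throws on an empty list)
if (newFiles.Count > 0)
{
    newFiles = newFiles.OrderByDescending(x => x.LastModified).ToList();
    checkpoint = newFiles[0].LastModified;
}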

Is there an efficient way to do this without always asking S3 for a list of all the objects and then filtering it?

  • The solution to find a list of objects _is_ to use events. Any other solution involves enumerating all of the files somehow and looking for new ones manually. – Anon Coward Apr 18 '22 at 14:19

1 Answer


Instead of checking the files in the S3 bucket periodically, you can use Amazon S3 event notifications to capture object additions, object removals, and so on. Refer to the AWS SDK for .NET documentation for SNS notification samples.
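
As a rough illustration of the event-based approach, the sketch below asks S3 to publish every "object created" event to an SNS topic using the AWS SDK for .NET (AWSSDK.S3 package). The class name `NewObjectNotifications`, its method names, and the bucket and topic ARN parameters are hypothetical. It assumes the SNS topic already exists and that its access policy allows S3 to publish to it.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class NewObjectNotifications
{
    // Builds a request that routes all "object created" events
    // (s3:ObjectCreated:*) for the bucket to the given SNS topic.
    // bucketName and topicArn are placeholders supplied by the caller.
    public static PutBucketNotificationRequest BuildRequest(string bucketName, string topicArn)
    {
        return new PutBucketNotificationRequest
        {
            BucketName = bucketName,
            TopicConfigurations = new List<TopicConfiguration>
            {
                new TopicConfiguration
                {
                    Topic = topicArn,
                    Events = new List<EventType> { EventType.ObjectCreatedAll }
                }
            }
        };
    }

    // Sends the configuration to S3. Note that this call replaces any
    // notification configuration that is already set on the bucket.
    public static async Task EnableAsync(IAmazonS3 s3Client, string bucketName, string topicArn)
    {
        await s3Client.PutBucketNotificationAsync(BuildRequest(bucketName, topicArn));
    }
}
```

Once this is in place, a subscriber to the topic receives a message for each new object, so the hourly full listing is no longer needed. This only helps on services that implement S3 event notifications.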

Anand Sowmithiran
  • It is a solution but I'm looking for a way to do it without using event notifications, if it's even possible – Sasha Chernin Apr 18 '22 at 07:24
  • @SashaChernin Why not use events? That's why they exist. – Marcin Apr 18 '22 at 07:54
  • @Marcin Because events are mainly a feature of Amazon S3, not of other S3-compatible services – Sasha Chernin Apr 18 '22 at 08:17
  • AWS S3 does not provide a way to filter a listing by the LastModified [attribute](https://github.com/aws/aws-cli/issues/1104#issuecomment-773545885). It returns up to 1,000 objects per response and indicates whether another page of data is available. So, as objects are added to the bucket, using the SNS topic notification is the only efficient way to do some work on each newly added object. – Anand Sowmithiran Apr 18 '22 at 11:12