I am trying to list all objects in a bucket whose keys come after a certain identifier.

Here is my code, with the irrelevant parts omitted.

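require_once LIB . DS . 'Aws/vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\Credentials\Credentials;

// Assuming AWS SDK for PHP v3: the key pair is passed through the
// 'credentials' option rather than top-level 'key'/'secret' entries.
$credentials = new Credentials(AWS_KEY, AWS_SECRET);

$s3 = new S3Client([
    'region'      => $region,
    'version'     => 'latest',
    'credentials' => $credentials,
]);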

$data = $s3->listObjectsV2([
    'Bucket'     => <MY BUCKET>,
    // the response content should be after this record
    'StartAfter' => '302760677',
    'MaxKeys'    => 2,
]);
print_r($data);

The result that I am getting is as follows:

Aws\Result Object
(
    [data:Aws\Result:private] => Array
        (
            [IsTruncated] => 1
            [Contents] => Array
                (
                    [0] => Array
                        (
                            [Key] => json/300705/300705046/status.json
                            [LastModified] => Aws\Api\DateTimeResult Object
                                (
                                    [date] => 2018-06-20 11:45:06.000000
                                    [timezone_type] => 2
                                    [timezone] => Z
                                )
                            [ETag] => "2777f5fabc31969108b16cd8459d3b5d"
                            [Size] => 945
                            [StorageClass] => STANDARD
                        )
                    [1] => Array
                        (
                            [Key] => json/300705/300705046/address.json
                            [LastModified] => Aws\Api\DateTimeResult Object
                                (
                                    [date] => 2018-06-20 11:45:06.000000
                                    [timezone_type] => 2
                                    [timezone] => Z
                                )

                            [ETag] => "3fd8ef54a83e93d470f5438079f51345"
                            [Size] => 477
                            [StorageClass] => STANDARD
                        )
                )
        )
)

Here, the response contains keys for ID 300705046, which is lower than the value I specified in the "StartAfter" parameter of the request.

Can anyone help me understand what I might be doing wrong?

Thanks

web-nomad

2 Answers

StartAfter (docs) is matched against the full object key, not against a fragment of it such as the ID.

For example, you might try requesting with:

'StartAfter' => 'json/302760/302760677/address.json',
Yarin
thomasmichaelwallace
  • Oh! That works, thanks a lot! From the docs, I assumed that it will scan the paths with the given identifier and then figure the rest. Apparently, that's not the case. – web-nomad Nov 01 '18 at 10:03

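'StartAfter' returns only the keys that sort after the given value. Keys are compared as strings in UTF-8 binary order, not as numbers. In your case you set 'StartAfter' to '302760677', which sorts before the key 'json/300705/300705046/status.json' because the character '3' comes before 'j', so that key is still returned. Instead try the following: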

'StartAfter' => 'json/302760/302760677'

or

'StartAfter' => 'json/302760/302760677/'

or even

'StartAfter' => 'json/302760/302760677/a'
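
The ordering behind these suggestions can be checked locally. The sketch below assumes S3's documented UTF-8 binary ordering of keys and uses PHP's strcmp, which compares byte by byte, as a stand-in for it. The helper name sortsAfter is hypothetical and is not part of the AWS SDK.

```php
<?php
// Hypothetical helper (not an AWS SDK function): true when $key would be
// listed after $startAfter under byte-wise (UTF-8 binary) ordering.
function sortsAfter(string $key, string $startAfter): bool
{
    return strcmp($key, $startAfter) > 0;
}

$key = 'json/300705/300705046/status.json';

// '3' (0x33) sorts before 'j' (0x6A), so the bare ID precedes every json/ key.
var_dump(sortsAfter($key, '302760677'));             // bool(true)

// Against a full-path marker, the same key sorts before it and is skipped.
var_dump(sortsAfter($key, 'json/302760/302760677')); // bool(false)
```

This is why the bare ID filters nothing out of the listing, while a marker written as a full path does.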

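You might want to use Prefix instead, so you don't have to know the exact key names inside the folder. Keep in mind that Prefix limits the results to keys beginning with that string, whereas StartAfter returns everything that sorts after it. In that case, you would use: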

'Prefix' => 'json/302760/302760677/'

If you are explicitly creating S3 folder objects (e.g. 'json/302760/302760677/') and don't want that folder object returned by your query, you can specify the same value as both the Prefix and the StartAfter, and the listing will contain only the folder's contents.
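
To see how the two parameters combine, here is a local sketch that mimics the filtering on a plain array of keys. It is only a simulation, under the assumption that S3 returns keys in byte order, keeps those beginning with Prefix, and skips everything up to and including StartAfter. The function simulateList and the sample keys are made up for illustration, and nothing here calls the AWS SDK.

```php
<?php
// Hypothetical simulation of the Prefix / StartAfter filtering; not an SDK call.
function simulateList(array $keys, string $prefix = '', string $startAfter = ''): array
{
    sort($keys, SORT_STRING); // byte-wise order, as assumed for S3 listings
    $matches = array_filter($keys, function (string $key) use ($prefix, $startAfter): bool {
        $hasPrefix = strncmp($key, $prefix, strlen($prefix)) === 0;
        $isAfter = $startAfter === '' || strcmp($key, $startAfter) > 0;
        return $hasPrefix && $isAfter;
    });
    return array_values($matches);
}

// Made-up sample keys, including an explicit "folder" object.
$keys = [
    'json/300705/300705046/status.json',
    'json/302760/302760677/',
    'json/302760/302760677/address.json',
    'json/302760/302760677/status.json',
    'json/302761/302761000/status.json',
];

$folder = 'json/302760/302760677/';

// Prefix alone: the folder object and its contents.
print_r(simulateList($keys, $folder));

// Prefix plus StartAfter set to the same value: contents only.
print_r(simulateList($keys, $folder, $folder));
```

In the second call the folder object equals the StartAfter value, so it is not strictly after it and drops out, leaving only address.json and status.json.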

Yarin
Jason H
  • I tested as is before I posted. Not sure what to say... – Jason H Mar 04 '21 at 23:30
  • This is also in line with what I would expect from a simple > implementation of StartAfter that you'd get from a database with an order by. So it seems reasonable in addition to just working. – Jason H Mar 05 '21 at 01:31
  • You're absolutely right, deleting my comment. Sloppy on my part! Thanks for teaching me something! – Yarin Mar 05 '21 at 17:07