
I have the following S3 bucket structure:

s3://<bucket_name>/
|---object_1/
|   |---images/
|   |   |---<image_11.jpg>
|   |   |---<image_12.jpg>
|   |---annotation/
|   |   |---<image_11.xml>
|   |   |---<image_12.xml>
|---object_2/
|   |---images/
|   |   |---<image_21.jpg>
|   |   |---<image_22.jpg>
|   |---annotation/
|   |   |---<image_21.xml>
|   |   |---<image_22.xml>

I want to move all the images and all the annotation files into two separate S3 prefixes, respectively, so that the destination structure looks like:

s3://<bucket_name>/
|---all-images/
|   |---<image_11.jpg>
|   |---<image_12.jpg>
|   |---<image_21.jpg>
|   |---<image_22.jpg>
|---all-annotation/
|   |---<image_11.xml>
|   |---<image_12.xml>
|   |---<image_21.xml>
|   |---<image_22.xml>

Question

I have tried the solution from this StackOverflow question, but it does not change the object structure: the files are copied with their folder names intact (here, object_1/images/image_11.jpg). I want all the images together under a single prefix, without the directory structure (here, all-images/<all_the_jpg_files>). How can I achieve this using the AWS CLI or a SageMaker notebook instance?

iamarchisha

1 Answer


Step 1

Create a list of prefixes using this StackOverflow question as the reference.
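As a rough sketch of this step (not necessarily what the linked answer does), the top-level prefixes can be collected with boto3 by listing the bucket with a "/" delimiter. The function name list_top_level_prefixes is hypothetical, and the client is passed in so that the function can be tried without AWS credentials.

```python
def list_top_level_prefixes(s3_client, bucket):
    """Return the top-level "folder" names in a bucket, without the trailing slash.

    s3_client is expected to be a boto3 S3 client (for example boto3.client("s3")).
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    prefixes = []
    # With Delimiter="/", S3 groups keys by their first path segment and
    # reports each group under "CommonPrefixes" (e.g. {"Prefix": "object_1/"}).
    for page in paginator.paginate(Bucket=bucket, Delimiter="/"):
        for entry in page.get("CommonPrefixes", []):
            prefixes.append(entry["Prefix"].rstrip("/"))
    return prefixes


# Usage in a notebook (assumes boto3 is installed and credentials are configured):
#   import boto3
#   list_of_prefixes = list_top_level_prefixes(boto3.client("s3"), "<bucket_name>")
```

Note that this also returns any other top-level prefix in the bucket, such as all-images once it exists, so the list may need filtering before Step 2.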

Step 2

Iterate over the list of prefixes (here, object_1, object_2, ...) and cp or sync each source prefix to the destination prefix. The following is the command I used in a SageMaker notebook instance.

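# Run in a notebook cell: IPython expands {label} before handing the command to the shell.
# cp leaves the originals in place; use "aws s3 mv" instead to move the files.
for label in list_of_prefixes:
    !aws s3 cp --recursive 's3://<bucket_name>/{label}/images/' 's3://<bucket_name>/all-images/'
    !aws s3 cp --recursive 's3://<bucket_name>/{label}/annotation/' 's3://<bucket_name>/all-annotation/'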
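The same flattening can also be sketched in plain Python with boto3, which makes the key mapping explicit. This is an illustrative alternative, not the method used above: DEST_PREFIX, flattened_key and copy_flattened are hypothetical names, and the sketch assumes every source key has exactly the form <object>/<images|annotation>/<file>.

```python
# Hypothetical mapping from source sub-folder to destination prefix.
DEST_PREFIX = {"images": "all-images", "annotation": "all-annotation"}


def flattened_key(key):
    """Map 'object_1/images/image_11.jpg' to 'all-images/image_11.jpg'.

    Returns None for keys that do not match <object>/<subfolder>/<file>.
    """
    parts = key.split("/")
    if len(parts) != 3 or parts[1] not in DEST_PREFIX or not parts[2]:
        return None
    return f"{DEST_PREFIX[parts[1]]}/{parts[2]}"


def copy_flattened(s3_client, bucket, keys):
    """Copy each matching key to its flattened destination within the same bucket.

    s3_client is expected to be a boto3 S3 client. Returns (source, destination) pairs.
    """
    copied = []
    for key in keys:
        dest = flattened_key(key)
        if dest is None:
            continue  # skip keys outside the expected layout
        s3_client.copy_object(
            Bucket=bucket,
            Key=dest,
            CopySource={"Bucket": bucket, "Key": key},
        )
        copied.append((key, dest))
    return copied
```

Whichever approach is used, files that share a name under different source prefixes end up at the same destination key, so the later copy overwrites the earlier one.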
iamarchisha