63

Suppose I have an S3 bucket named x.y.z

In this bucket, I have hundreds of files, but I only want to delete two files named purple.gif and worksheet.xlsx

Can I do this from the AWS command line tool with a single call to rm?

This did not work:

$ aws s3 rm s3://x.y.z/worksheet.xlsx s3://x.y.z/purple.gif
Unknown options: s3://x.y.z/purple.gif

From the manual, it doesn't seem like you can delete a list of files explicitly by name. Does anyone know a way to do it? I'd prefer not to use the --recursive flag.

Saqib Ali

10 Answers

137

You can do this by providing the --exclude or --include argument multiple times. But you'll have to use --recursive for this to work.

When there are multiple filters, remember that their order matters: filters that appear later in the command take precedence over filters that appear earlier.

aws s3 rm s3://x.y.z/ --recursive --exclude "*" --include "purple.gif" --include "worksheet.xlsx"

Here, all files will be excluded from the command except for purple.gif and worksheet.xlsx.

If you're unsure, always try a --dryrun first and inspect which files will be deleted.
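For example, this preview (same bucket and file names as the question) prints each matched key as a "(dryrun) delete: ..." line without removing anything:

aws s3 rm s3://x.y.z/ --recursive --dryrun --exclude "*" --include "purple.gif" --include "worksheet.xlsx"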

Source: Use of Exclude and Include Filters

Khalid T.
  • Note that this will also delete any files in sub-folders matching the --include patterns – ashtonium Aug 20 '18 at 17:35
  • While this perhaps isn't the best answer for the original questioner, I'm quite sure I'm not alone in coming here after searching for "how to delete multiple files from a bucket by specifying a pattern". Thank you! But I do find it a bit scary, so I would suggest adding a suggestion to try the "--dryrun" parameter first. – tobixen Feb 07 '19 at 09:48
  • Is the logic in all the arguments required to make this work _really_ 'recursively include everything within this folder', 'exclude everything', 'include my selections'? Is it me or is this triply tautological? (btw: this is the AWS CLI's [documented solution](https://docs.aws.amazon.com/cli/latest/reference/s3/index.html#use-of-exclude-and-include-filters).) – leerssej Apr 08 '20 at 20:23
  • Yeah. Using something logical (to me) like `aws s3 rm s3:///test-folder/ --include "*.txt"` to remove all text files in this directory did nothing. Then `aws s3 rm s3:///test-folder/ --recursive --include "*.txt"` actually wipes ALL files AND subdirectories in `/test-folder/`! You'd think it'd only take the .txt files... huh. +1 for `--dryrun` – ericOnline Oct 21 '20 at 22:56
  • Is there a way to permanently delete objects using the CLI? – Asker Dec 08 '22 at 11:05
  • @Asker You can, but with **s3api**, and you must be the bucket owner and use the version Id. See [Deleting object versions from a versioning-enabled bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/DeletingObjectVersions.html) – Khalid T. Dec 11 '22 at 08:49
36

s3 rm cannot take multiple keys in a single call, but you can use s3api delete-objects to achieve what you want here.

Example

aws s3api delete-objects --bucket x.y.z --delete '{"Objects":[{"Key":"worksheet.xlsx"},{"Key":"purple.gif"}]}'
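
If the list of keys is long, the same JSON can be kept in a file and passed with the CLI's file:// syntax (delete.json is a hypothetical file name here):

aws s3api delete-objects --bucket x.y.z --delete file://delete.json

where delete.json contains {"Objects":[{"Key":"worksheet.xlsx"},{"Key":"purple.gif"}],"Quiet":true}. The optional "Quiet":true suppresses the per-key result output. Note that delete-objects accepts at most 1,000 keys per call.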
spg
  • this works, but the "you cannot use `s3 rm`" part of the answer is *incorrect*. – ashtonium Aug 20 '18 at 17:40
  • This is the only answer that actually deletes a list of files instead of trying to create an equivalent wildcard and hoping it doesn't match anything else! – jb_dk May 31 '23 at 08:20
14

Apparently aws s3 rm works only on individual files/objects.

Below is a bash pipeline that constructs individual delete commands and then removes the objects one by one. It works with some success (it might be a bit slow, but it works):

aws s3 ls s3://bucketname/foldername/ |
awk '{print "aws s3 rm s3://bucketname/foldername/" $4}' |
bash

The first two lines construct the "rm" commands, and the third line (bash) executes them.

Note that you might face issues if your object names contain spaces or other special characters: awk '{print $4}' captures only the fourth whitespace-delimited field, so anything after the first space in a key is lost.
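
A sketch of a variant that sidesteps the field-splitting problem by asking the API for the keys directly (same bucketname/foldername placeholders as above; requires jq):

aws s3api list-objects-v2 --bucket bucketname --prefix foldername/ \
    --query 'Contents[].Key' --output json |
jq -r '.[]?' |
while IFS= read -r key; do
    aws s3 rm "s3://bucketname/$key"
done

This still issues one rm call per object, so it is just as slow, but keys containing spaces survive intact.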

Thyag
8

This command deletes all files in a bucket:
aws s3 rm s3://bucketname --recursive

samira
3

You can delete multiple files using aws s3 rm. If you want to delete all files in a specific folder, just use

aws s3 rm --recursive --region <AWS_REGION> s3://<AWS_BUCKET>/<FOLDER_PATH>/

First, test it with the --dryrun option!

rmuller
3
aws s3 rm s3://<bucketname>/2023/ --recursive --exclude '*' --include 'A*.csv' 

None of the answers above mentions how to use a wildcard expression to select multiple files for deletion.

If your use case is to select multiple objects in S3 by a naming pattern and delete them, the command above will be useful: --exclude '*' first drops every key from the match set, and --include 'A*.csv' then re-adds only the keys under 2023/ that match the pattern.

Anandkumar
  • This question already has quite a few answers—including one that has been extensively validated by the community. Are you sure your approach hasn’t been given previously? **If so, it would be useful to explain how your approach is different, under what circumstances your approach might be preferred, and/or why you think the previous answers aren’t sufficient.** Can you kindly [edit] your answer to offer an explanation? – Jeremy Caney May 11 '23 at 00:29
2

If you are using the AWS CLI, you can filter ls results with a grep regex and delete the matches. For example:

aws s3 ls s3://BUCKET | awk '{print $4}' | grep -E -i '^2015-([0-9][0-9])\-([0-9][0-9])\-([0-9][0-9])\-([0-9][0-9])\-([0-9][0-9])\-([0-9a-zA-Z]*)' | xargs -I% bash -c 'aws s3 rm s3://BUCKET/%'

This is slow, but it works.

Paul Sheldrake
  • `xargs -I% bash -c '...%...'` introduces totally unnecessary security vulnerabilities (a key containing `$(rm -rf ~)` and you'll have a very bad day). Why not just use `xargs aws s3 rm s3://BUCKET/%` with no `bash -c`? – Charles Duffy Jun 04 '22 at 01:26
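
Following that comment, an injection-safe sketch of the same pipeline: with xargs -I{}, each key is substituted directly into the aws argument list and never interpreted by a shell (the grep pattern is shortened here for readability; use the full regex from the answer):

aws s3 ls s3://BUCKET | awk '{print $4}' | grep -E -i '^2015-' | xargs -I{} aws s3 rm 's3://BUCKET/{}'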
1

I found this one useful on the command line. I had more than 4 million files, and it took almost a week to empty the bucket. This comes in handy because the AWS console is not descriptive with its logs.

Note: You need the jq tool installed.

 aws s3api list-object-versions --bucket YOURBUCKETNAMEHERE \
     --output json --query 'Versions[].[Key, VersionId]' \
     | jq -r '.[] | "--key '\''" + .[0] + "'\'' --version-id " + .[1]' \
     | xargs -L1 aws s3api delete-object --bucket YOURBUCKETNAMEHERE
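
Note that Versions[] covers object versions only; on a versioning-enabled bucket you may also need a second pass with --query 'DeleteMarkers[].[Key, VersionId]' to sweep the delete markers (assuming your bucket has any). Also, xargs -L1 issues one delete-object call per version, which explains the multi-day runtime on millions of objects.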
coder
1

This solution will work when you want to specify a wildcard for the object name.

aws s3 ls dmap-live-dwh-files/backup/mongodb/oms_api/hourly/ | grep 'order_2019_08_09_' | awk '{print "aws s3 rm s3://dmap-live-dwh-files/backup/mongodb/oms_api/hourly/" $4}' | bash
Yogesh Patil
  • `aws s3 ls --recursive BUCKET_NAME | grep PATTERN_TO_DELETE | awk '{print "aws s3 rm s3://BUCKET_NAME/" $4}' | bash` you can test before you delete by removing `| bash` – user3712451 Feb 18 '21 at 21:22
0

A quick way to delete a very large folder in AWS:

# set the placeholders once, then list keys and delete them in parallel batches of 1000
AWS_PROFILE=<AWS_PROFILE> AWS_BUCKET=<AWS_BUCKET> AWS_FOLDER=<AWS_FOLDER>
aws --profile "$AWS_PROFILE" s3 ls "s3://${AWS_BUCKET}/${AWS_FOLDER}/" |
awk '{print $4}' |
xargs -P8 -n1000 bash -c 'aws --profile '${AWS_PROFILE}' s3api delete-objects --bucket '${AWS_BUCKET}' --delete "Objects=[$(printf "{Key='${AWS_FOLDER}'/%s}," "$@")],Quiet=true" >/dev/null 2>&1' _
# the trailing _ fills bash's $0 so that the first key of each xargs batch is not silently dropped from "$@"

PS: You might need to run this two or three times, because some deletions occasionally fail...
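
If you want to see which batches fail, drop the >/dev/null 2>&1 redirection so delete-objects errors reach the terminal; that makes it easier to judge whether another pass is needed.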