46
aws s3 cp "dist/myfile" "s3://my-bucket/production/myfile"

It always copies myfile to S3. I would like to copy the file ONLY if it does not exist, and throw an error otherwise. How can I do it? Or at least, how can I use the AWS CLI to check whether the file already exists?

John Rotenstein
user606521
  • Related: [GH-404](https://github.com/aws/aws-cli/issues/404), [GH-1449](https://github.com/aws/aws-cli/issues/1449) and [GH-2874](https://github.com/aws/aws-cli/issues/2874) at GitHub. – kenorb Feb 13 '18 at 12:51

8 Answers

65

You could test for the existence of a file by listing the file, and seeing whether it returns something. For example:

aws s3 ls s3://bucket/file.txt | wc -l

This would return a zero (no lines) if the file does not exist.
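
A minimal sketch of turning that into a shell test (bash; same placeholder bucket and key as above, and note the prefix-matching caveat in the comments below):

    count=$(aws s3 ls "s3://bucket/file.txt" | wc -l)
    if [ "$count" -eq 0 ]; then
        echo "file.txt does not appear to exist in the bucket"
    fi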


If you only want to copy a file if it does not exist, try the sync command, e.g.:

aws s3 sync . s3://bucket/ --exclude '*' --include 'file.txt'

This will synchronize the local file with the remote object, copying it only if the remote object does not exist or if the local file differs from it.
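
Applied to the paths from the question, that would look roughly like this (a sketch, not tested):

aws s3 sync dist/ s3://my-bucket/production/ --exclude '*' --include 'myfile'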

kenorb
John Rotenstein
  • 7
    `aws s3 ls s3://bucket/file.txt | wc -l` is not accurate - it will list files with that prefix, rather than the exact name. Consider the following: http://stackoverflow.com/a/17456170/391618 – T.K. Jun 12 '15 at 16:25
  • 1
    You can use it on files if you specify the `--include` and `--exclude` parameters to limit it to just one file in the directory. I have now fixed the above example to reflect this. – John Rotenstein Oct 20 '16 at 00:26
  • `aws s3 ls ... | wc -l` - there is false negative when the `aws` command does not exist or if any other error happens. – Messa Sep 09 '18 at 19:35
  • You can just flip the order of these arguments to have the same logic apply to downloading locally instead: `aws s3 sync s3://bucket/ . --exclude '*' --include 'file.txt'` – Akaisteph7 Jun 27 '22 at 16:16
10

So, turns out that "aws s3 sync" doesn't do files, only directories. If you give it a file, you get...interesting...behavior, since it treats anything you give it like a directory and throws a slash on it. At least aws-cli/1.6.7 Python/2.7.5 Darwin/13.4.0 does.

    %% date > test.txt
    %% aws s3 sync test.txt s3://bucket/test.txt
    warning: Skipping file /Users/draistrick/aws/test.txt/. File does not exist.

So, if you -really- only want to sync a file (only upload if exists, and if checksum matches) you can do it:

    file="test.txt"
    aws s3 sync --exclude '*' --include "$file" "$(dirname $file)" "s3://bucket/"

Note the exclude/include order - if you reverse that, it won't include anything. And your source and include path need to have sanity around their matching, so maybe a $(basename $file) is in order for --include if you're using full paths... aws --debug s3 sync is your friend here to see how the includes evaluate.
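
For instance, a sketch of that full-path variant (path reused from the output above, purely for illustration; untested):

    file="/Users/draistrick/aws/test.txt"
    aws s3 sync --exclude '*' --include "$(basename "$file")" "$(dirname "$file")" "s3://bucket/"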

And don't forget the target is a directory key, not a file key.

Here's a working example:

  %% file="test.txt"
  %% date >> $file
  %% aws s3 sync --exclude '*' --include "$file" "$(dirname $file)" "s3://bucket/"
  upload: ./test.txt to s3://bucket/test.txt/test.txt
  %% aws s3 sync --exclude '*' --include "$file" "$(dirname $file)" "s3://bucket/"
  %% date >> $file
  %% aws s3 sync --exclude '*' --include "$file" "$(dirname $file)" "s3://bucket/"
  upload: ./test.txt to s3://bucket/test.txt/test.txt

(now, if only there were a way to ask aws s3 to -just- validate the checksum, since it seems to always do multipart style checksums.. oh, maybe some --dryrun and some output scraping and sync..)

keen
  • 816
  • 10
  • 11
  • not according to the docs - http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html `Syncs directories and S3 prefixes` – keen Oct 19 '16 at 22:40
  • it also says: "Recursively copies new and updated files from the source directory to the destination". And I've tried it and it seems to transfer files from local filesystem to S3. So transferring for the first time does not mean sync-ing them? – olala Oct 19 '16 at 22:58
  • the question (and this answer) was regarding the sync of a SINGLE file, not a directory containing files. `aws s3 sync` doesnt support being given the path to a source file, only to a source directory (see the first code block in my answer). so "no, `aws s3 sync` still doesn't support individual files". while this distinction might seem minor, it turns out to be significant if that's what you're trying to achieve. :) – keen Oct 28 '16 at 20:06
  • You managed to make something very simple look incredibly complicated. `aws s3 sync $(dirname foo.txt) s3://bucket --exclude='*' --include='foo.txt'` CLEARLY supports single file syncing. – miterhen Oct 04 '22 at 08:36
  • @miterhen you mean: ```aws s3 sync --exclude '*' --include "$file" "$(dirname $file)" "s3://bucket/"``` is so significantly more complicated than your example? I'm confused. one uses a variable, the other doesnt. The complexity comes from explaining how it works, and why there are a bunch of ways that it doesnt work. – keen Dec 18 '22 at 19:23
  • I was just making the point that the first sentence in your answer is wrong since it does support single file syncing via the `include` flag. Also generally a good answer to a question like this would be a 2-liner that gets straight to the point. – miterhen Dec 20 '22 at 12:25
  • sure, except that prior to my explanation in my answer 7 years ago, there were no "correct" answers. none. the "accepted" answer was updated over a year after my explanation to use include. that's awesome, I dont care, but detailed explanations are useful. (and we're not talking about obvious behavior here, either.) – keen Jan 26 '23 at 00:18
7

You can do this by listing the object first and copying only if the listing fails (i.e. the object does not exist).

aws s3 ls "s3://my-bucket/production/myfile" || aws s3 cp "dist/myfile" "s3://my-bucket/production/myfile"

Edit: replaced && with || to get the desired behaviour: copy only when the listing fails.
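
If you also want the script to fail loudly when the object already exists (as the question asks), a sketch along these lines may be clearer; it assumes `aws s3 ls` exits non-zero when nothing matches, which the comments below suggest has not always been the case:

    if aws s3 ls "s3://my-bucket/production/myfile" > /dev/null 2>&1; then
        echo "s3://my-bucket/production/myfile already exists" >&2
        exit 1
    fi
    aws s3 cp "dist/myfile" "s3://my-bucket/production/myfile"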

GMartinez
aviggiano
  • while true, it does create 2 calls to s3, with twice the opportunity for failure of the request, and twice as much need for error/retry handling... – keen Sep 25 '15 at 21:38
  • This logically does the opposite, it copies the file if it is already there, functionally it always does the copy bc aws s3 ls doesn't fail if the file isnt there – Pat Mc Aug 15 '16 at 19:26
  • In case we want to make the build fail,explicitly with CI/CD tools like CircleCI. The following modification helps. `(aws s3 ls "s3://my-bucket/production/myfile") && exit 1 || aws s3 cp "dist/myfile" "s3://my-bucket/production/myfile"` – Sudarshan Vidhate Aug 20 '19 at 11:48
  • This does not work as expected. `aws s3 ls` does not only match files with exact name but also files that start with the same name. So `aws s3 ls s3://my-bucket/test` will also match `test-foo`. – jelhan Jan 15 '21 at 13:13
2

You can also check the existence of a file with the aws s3api head-object subcommand. An advantage of this over aws s3 ls is that it only requires the s3:GetObject permission instead of s3:ListBucket.

$ aws s3api head-object --bucket ${BUCKET} --key ${EXISTENT_KEY}
{
    "AcceptRanges": "bytes",
    "LastModified": "Wed, 1 Jan 2020 00:00:00 GMT",
    "ContentLength": 10,
    "ETag": "\"...\"",
    "VersionId": "...",
    "ContentType": "binary/octet-stream",
    "ServerSideEncryption": "AES256",
    "Metadata": {}
}
$ echo $?
0

$ aws s3api head-object --bucket ${BUCKET} --key ${NON_EXISTENT_KEY}

An error occurred (403) when calling the HeadObject operation: Forbidden
$ echo $?
255

Note that the HTTP status code for the non-existent object depends on whether you have the s3:ListBucket permission. See the API documentation for more details:

  • If you have the s3:ListBucket permission on the bucket, Amazon S3 returns an HTTP status code 404 ("no such key") error.
  • If you don’t have the s3:ListBucket permission, Amazon S3 returns an HTTP status code 403 ("access denied") error.
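
A minimal sketch that uses the exit status to drive the copy from the original question (bucket and key are the question's placeholders; remember a non-zero status can also mean 403 rather than "missing"):

    # copy only when head-object reports the key as absent (or inaccessible)
    if ! aws s3api head-object --bucket my-bucket --key production/myfile > /dev/null 2>&1; then
        aws s3 cp "dist/myfile" "s3://my-bucket/production/myfile"
    else
        echo "s3://my-bucket/production/myfile already exists" >&2
        exit 1
    fi
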
okapies
0

AWS HACK

You can run the following command to raise an ERROR if the file already exists:

  • Run the aws s3 sync command to sync the file to S3; it prints the copied path if the file doesn't exist and produces no output if it already exists
  • Run wc -c to count the characters of that output and raise an error if the count is zero

com=$(aws s3 sync dist/ s3://my-bucket/production/ | wc -c);if [[ $com -eq 0 ]]; then exit 1; else exit 0; fi;

OR

#!/usr/bin/env bash
com=$(aws s3 sync dist s3://my-bucket/production/ | wc -c)
echo "sync output length: $com"
if [[ $com -eq 0 ]]; then
    echo "File already exists"
    exit 1
else
    echo "success"
    exit 0
fi
Anand Tripathi
0

I upvoted aviggiano. Using his example above, I was able to get this to work in my Windows .bat file. If the S3 path exists, it throws an error and ends the batch job. If the file does not exist, it continues on to perform the copy. Hope this helps someone.

:Step1

aws s3 ls s3://00000000000-fake-bucket/my/s3/path/inbound/test.txt && ECHO Could not copy to S3 bucket because S3 Object already exists, ending script. && GOTO :Failure

ECHO No file found in bucket, begin upload.

aws s3 cp Z:\MY\LOCAL\PATH\test.txt s3://00000000000-fake-bucket/my/s3/path/inbound/test.txt --exclude "*" --include "*.txt"


:Step2

ECHO YOU MADE IT, LET'S CELEBRATE

IF %ERRORLEVEL% == 0 GOTO :Success
GOTO :Failure

:Success
echo Job Ended Success
GOTO :ExitScript

:Failure
echo BC_Script_Execution_Complete Failure
GOTO :ExitScript

:ExitScript
Kachopsticks
0

I am running the AWS CLI on Windows, and this is my simple script.

rem clean work files:

if exist  SomeFileGroup_remote.txt del /q SomeFileGroup_remote.txt
if exist  SomeFileGroup_remote-fileOnly.txt del /q SomeFileGroup_remote-fileOnly.txt
if exist  SomeFileGroup_Local-fileOnly.txt del /q SomeFileGroup_Local-fileOnly.txt
if exist  SomeFileGroup_remote-Download-fileOnly.txt del /q SomeFileGroup_remote-Download-fileOnly.txt

Rem prep:

call F:\Utilities\BIN\mhedate.cmd
aws s3 ls s3://awsbucket//someuser@domain.com/BulkRecDocImg/folder/folder2/ --recursive >>SomeFileGroup_remote.txt
for /F "tokens=1,2,3,4* delims= " %%i in (SomeFileGroup_remote.txt) do @echo %%~nxl >>SomeFileGroup_remote-fileOnly.txt
dir /b temp\*.* >>SomeFileGroup_Local-fileOnly.txt
findstr  /v /I /l /G:"SomeFileGroup_Local-fileOnly.txt" SomeFileGroup_remote-fileOnly.txt >>SomeFileGroup_remote-Download-fileOnly.txt

Rem Download:

for /F "tokens=1* delims= " %%i in (SomeFileGroup_remote-Download-fileOnly.txt) do (aws s3 cp s3://awsbucket//someuser@domain.com/BulkRecDocImg/folder/folder2/%%~nxi "temp" >>"SomeFileGroup_Download_%DATE.YEAR%%DATE.MONTH%%DATE.DAY%.log")
lejlun
HandyManny
0

I added the date to the path in order not to overwrite the file:

    aws s3 cp videos/video_name.mp4 s3://BUCKET_NAME/$(date +%D-%H:%M:%S)

That way I will have a history, and the existing file won't be overwritten.

Ido Bleicher