
I'm using Bitbucket Pipelines. I want it to push the entire contents of my repo (very small) to S3. I don't want to have to zip it up, push the zip to S3 and then unzip it there. I just want it to take the existing file/folder structure in my Bitbucket repo and push that to S3.

What should the yaml file and .py file look like to accomplish this?

Here is the current yaml file:

image: python:3.5.1

pipelines:
  branches:
    master:
      - step:
          script:
            # - apt-get update # required to install zip
            # - apt-get install -y zip # required if you want to zip repository objects
            - pip install boto3==1.3.0 # required for s3_upload.py
            # the first argument is the name of the existing S3 bucket to upload the artefact to
            # the second argument is the artefact to be uploaded
            # the third argument is the bucket key
            # html files
            - python s3_upload.py my-bucket-name html/index_template.html html/index_template.html # run the deployment script
            # Example command line parameters. Replace with your values
            #- python s3_upload.py bb-s3-upload SampleApp_Linux.zip SampleApp_Linux # run the deployment script

And here is my current python:

from __future__ import print_function
import os
import sys
import argparse
import boto3
from botocore.exceptions import ClientError

def upload_to_s3(bucket, artefact, bucket_key):
    """
    Uploads an artefact to Amazon S3
    """
    try:
        client = boto3.client('s3')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False
    try:
        with open(artefact, 'rb') as body:
            client.put_object(
                Body=body,
                Bucket=bucket,
                Key=bucket_key
            )
    except ClientError as err:
        print("Failed to upload artefact to S3.\n" + str(err))
        return False
    except IOError as err:
        print("Failed to access artefact in this directory.\n" + str(err))
        return False
    return True


def main():

    parser = argparse.ArgumentParser()
    parser.add_argument("bucket", help="Name of the existing S3 bucket")
    parser.add_argument("artefact", help="Name of the artefact to be uploaded to S3")
    parser.add_argument("bucket_key", help="Name of the S3 Bucket key")
    args = parser.parse_args()

    if not upload_to_s3(args.bucket, args.artefact, args.bucket_key):
        sys.exit(1)

if __name__ == "__main__":
    main()

This requires me to list every single file in the repo as a separate command in the yaml file. I just want it to grab everything and upload it to S3.

Scott Decker
    What specifically is the question? – jzonthemtn Sep 16 '16 at 20:09
  • @jbird He's asking the basic question of how to recursively send multiple files to S3 using the sample provided by AWS Labs for BitBucket pipelines. @scottndecker I am having the same issue as well. In Bamboo I ran a shell script to handle that: ``` #!/bin/bash export AWS_ACCESS_KEY_ID=${bamboo.awsAccessKeyId} export AWS_SECRET_ACCESS_KEY=${bamboo.awsSecretAccessKeyPassword} export AWS_DEFAULT_REGION=us-east-1 aws s3 sync dist/library s3://yourbuckethere/ --delete aws s3 sync dist/library s3://yourbuckethere/``` Have not had luck yet in Pipelines – isaac weathers Nov 28 '16 at 19:35
  • Yeah, it looks like from http://boto3.readthedocs.io/en/latest/guide/migrations3.html that you can configure the Python script to iterate through each key and add those to the bucket, but that would need to map to something. If you just try the easy way I tried, passing `src/* src` (i.e. take all files in the `src` directory and upload them), it fails: `+ python s3_upload.py patternlib-s3-upload-test src/* src` `usage: s3_upload.py [-h] bucket artefact bucket_key` `s3_upload.py: error: unrecognized arguments: src/index.md src/molecules src/pages src/release11.css src/rest src` – isaac weathers Nov 28 '16 at 20:22
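(The error in that last comment comes from the shell expanding `src/*` into many arguments before argparse sees them; the script in the question accepts exactly one artefact. A minimal, hypothetical sketch of how the argument parsing alone could be changed to tolerate a glob, assuming you still want one key per file:)

import argparse

# Hypothetical variant of the argument parsing only: accept one or more artefacts,
# so a shell glob like src/* (which the shell expands into many paths) is accepted.
parser = argparse.ArgumentParser()
parser.add_argument("bucket", help="Name of the existing S3 bucket")
parser.add_argument("artefacts", nargs="+", help="One or more files to upload")
args = parser.parse_args()

for path in args.artefacts:
    # Reusing the local path as the bucket key preserves the repo layout.
    print("would upload " + path + " to s3://" + args.bucket + "/" + path)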

4 Answers


You can switch to using the Docker image https://hub.docker.com/r/abesiyo/s3/

It runs quite well

bitbucket-pipelines.yml

image: abesiyo/s3

pipelines:
  default:
    - step:
        script:
          - s3 --region "us-east-1" rm s3://<bucket name>
          - s3 --region "us-east-1" sync . s3://<bucket name>

Please also set up the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in Bitbucket Pipelines.

lyanhhoang

Figured it out myself. Here is the Python file, 's3_upload.py':

from __future__ import print_function
import os
import sys
import argparse
import boto3
#import zipfile
from botocore.exceptions import ClientError

def upload_to_s3(bucket, artefact, is_folder, bucket_key):
    try:
        client = boto3.client('s3')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False
    if is_folder == 'true':
        for root, dirs, files in os.walk(artefact, topdown=False):
            print('Walking it')
            for file in files:
                #add a check like this if you just want certain file types uploaded
                #if file.endswith('.js'):
                try:
                    print(file)
                    client.upload_file(os.path.join(root, file), bucket, os.path.join(root, file))
                except ClientError as err:
                    print("Failed to upload artefact to S3.\n" + str(err))
                    return False
                except IOError as err:
                    print("Failed to access artefact in this directory.\n" + str(err))
                    return False
                #else:
                #    print('Skipping file:' + file)
    else:
        print('Uploading file ' + artefact)
        client.upload_file(artefact, bucket, bucket_key)
    return True


def main():

    parser = argparse.ArgumentParser()
    parser.add_argument("bucket", help="Name of the existing S3 bucket")
    parser.add_argument("artefact", help="Name of the artefact to be uploaded to S3")
    parser.add_argument("is_folder", help="True if its the name of a folder")
    parser.add_argument("bucket_key", help="Name of file in bucket")
    args = parser.parse_args()

    if not upload_to_s3(args.bucket, args.artefact, args.is_folder, args.bucket_key):
        sys.exit(1)

if __name__ == "__main__":
    main()

And here is the bitbucket-pipelines.yml file:

---
image: python:3.5.1

pipelines:
  branches:
    dev:
      - step:
          script:
            - pip install boto3==1.4.1 # required for s3_upload.py
            - pip install requests
            # the first argument is the name of the existing S3 bucket to upload the artefact to
            # the second argument is the artefact to be uploaded
            # the third argument is if the artefact is a folder
            # the fourth argument is the bucket_key to use
            - python s3_emptyBucket.py dev-slz-processor-repo
            - python s3_upload.py dev-slz-processor-repo lambda true lambda
            - python s3_upload.py dev-slz-processor-repo node_modules true node_modules
            - python s3_upload.py dev-slz-processor-repo config.dev.json false config.json
    stage:
      - step:
          script:
            - pip install boto3==1.3.0 # required for s3_upload.py
            - python s3_emptyBucket.py staging-slz-processor-repo
            - python s3_upload.py staging-slz-processor-repo lambda true lambda
            - python s3_upload.py staging-slz-processor-repo node_modules true node_modules
            - python s3_upload.py staging-slz-processor-repo config.staging.json false config.json
    master:
      - step:
          script:
            - pip install boto3==1.3.0 # required for s3_upload.py
            - python s3_emptyBucket.py prod-slz-processor-repo
            - python s3_upload.py prod-slz-processor-repo lambda true lambda
            - python s3_upload.py prod-slz-processor-repo node_modules true node_modules
            - python s3_upload.py prod-slz-processor-repo config.prod.json false config.json

As an example, for the dev branch it grabs everything in the "lambda" folder, walks the entire structure of that folder, and uploads each item it finds to the dev-slz-processor-repo bucket.
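For clarity on the key layout: upload_file is called with os.path.join(root, file) as both the local path and the bucket key, so the repo's folder structure is mirrored exactly in the bucket. A tiny illustration with a hypothetical path:

import os

# Hypothetical walk result while uploading the "lambda" folder:
# root="lambda/handlers", file="process.py". The same joined path is used as
# the local file and as the S3 key, so the object would land at
# s3://dev-slz-processor-repo/lambda/handlers/process.py.
root, file = "lambda/handlers", "process.py"
print(os.path.join(root, file))  # lambda/handlers/process.py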

Lastly, here is a helpful little script, 's3_emptyBucket.py', to remove all objects from the bucket before uploading the new ones:

from __future__ import print_function
import os
import sys
import argparse
import boto3
#import zipfile
from botocore.exceptions import ClientError

def empty_bucket(bucket):
    try:
        resource = boto3.resource('s3')
    except ClientError as err:
        print("Failed to create boto3 resource.\n" + str(err))
        return False
    print("Removing all objects from bucket: " + bucket)
    try:
        resource.Bucket(bucket).objects.delete()
    except ClientError as err:
        print("Failed to empty bucket.\n" + str(err))
        return False
    return True


def main():

    parser = argparse.ArgumentParser()
    parser.add_argument("bucket", help="Name of the existing S3 bucket to empty")
    args = parser.parse_args()

    if not empty_bucket(args.bucket):
        sys.exit(1)

if __name__ == "__main__":
    main()
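One note on credentials: neither script references access keys directly; boto3 resolves AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment variables configured in the Bitbucket Pipelines settings. A minimal sanity check (hypothetical, not part of the scripts above) could fail the build early if they are missing:

import os

# boto3 reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY from the environment,
# so nothing sensitive needs to live in the repository. Fail fast if they are absent.
missing = [name for name in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
           if not os.environ.get(name)]
if missing:
    raise SystemExit("Missing environment variables: " + ", ".join(missing))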
Scott Decker
  • Nice. Thanks for posting. Gonna give it a try out myself and get back to you to upvote. – isaac weathers Nov 29 '16 at 15:24
  • Works like a top. Thanks dude. Did have a question though. I see you upgraded from BOTO 1.3.0 to 1.4.1. Was this a requirement or just a personal pref? Also, it seemed like the standard script from AWS already replaced the existing with the latest but that was only testing with their sample linux app. Did you see a discrepancy when trying to upload and replace multiple keys/files? – isaac weathers Nov 29 '16 at 15:50
  • @isaacweathers good questions. 1) Upgraded boto to 1.4.1 because of this line in the s3_emptyBucket function: "resource.Bucket(bucket).objects.delete()" which I don't think is available in 1.3.0. 2) I first empty the bucket because if a file was removed from source folder, I don't want it to hang around in the s3 bucket – Scott Decker Nov 29 '16 at 21:27
  • Right on. Looking at trying to pass multiple directories from existing structure using your code. If I can't get it working I'll post a question and tag you on it to see if you have any advice. Looks like it would be best to take the file structure, zip it and then unzip in S3 but that requires using EC2 config as it is not possible to unzip in the S3 bucket. – isaac weathers Nov 30 '16 at 01:11
  • I have already set the access key ID and secret access key in the environment variables. How do I pass those to the yml file? – captainblack Apr 07 '17 at 07:57
  • Same here can you please tell how to pass the ID and Key in yml file – Subrata Fouzdar Jun 05 '17 at 13:51
  • @SubrataFouzdar see this article https://confluence.atlassian.com/bitbucket/environment-variables-794502608.html You can reference them in the yaml with $AWS_SECRET for example. No need to "pass them" so to speak. – Scott Decker Jun 06 '17 at 14:46
  • @captainblack see comment above – Scott Decker Jun 06 '17 at 14:46

Atlassian now offers "Pipes" to simplify configuration of some common tasks. There's one for S3 upload as well.

No need to specify a different image type:

image: node:8

pipelines:
  branches:
    master:
      - step:
          script:
            - pipe: atlassian/aws-s3-deploy:0.2.1
              variables:
                AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID
                AWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY
                AWS_DEFAULT_REGION: "us-east-1"
                S3_BUCKET: "your.bucket.name"
                LOCAL_PATH: "dist"
thomaux

For deploying a static website to Amazon S3 I have this bitbucket-pipelines.yml configuration file:

image: attensee/s3_website

pipelines:
  default:
    - step:
        script:
          - s3_website push

I'm using the attensee/s3_website Docker image because it has the awesome s3_website tool installed. The s3_website configuration file (s3_website.yml), which you create in the root directory of the repository in Bitbucket, looks something like this:

s3_id: <%= ENV['S3_ID'] %>
s3_secret: <%= ENV['S3_SECRET'] %>
s3_bucket: bitbucket-pipelines
site: .

We have to define the environment variables S3_ID and S3_SECRET in the Bitbucket settings.

Thanks to https://www.savjee.be/2016/06/Deploying-website-to-ftp-or-amazon-s3-with-BitBucket-Pipelines/ for the solution.

Subrata Fouzdar
  • There's another docker image: shadyoak/s3_website, which downloads the latest version of s3_website. Important if you want to use some of the newer features like s3_key_prefix – Michal May 13 '18 at 09:23