12

I'm a total noob to working with AWS. I am trying to get a pretty simple and basic operation to work: when a file is uploaded to one S3 bucket, I want that upload to trigger a Lambda function that copies the file to another bucket.

In the AWS Management Console, I created an S3 bucket in the us-west-2 region called "test-bucket-3x1" to use as my "source" bucket and another called "test-bucket-3x2" as my "destination" bucket. I did not modify any settings when creating these buckets.

In the Lambda console, I created an S3 trigger for 'test-bucket-3x1', changed the event type to "ObjectCreatedByPut", and didn't change any other settings.

This is my actual lambda_function code:

import boto3
import json
s3 = boto3.resource('s3')


def lambda_handler(event, context):
    bucket = s3.Bucket('test-bucket-3x1')
    dest_bucket = s3.Bucket('test-bucket-3x2')
    print(bucket)
    print(dest_bucket)

    for obj in bucket.objects():
        dest_key = obj.key
        print(dest_key)
        s3.Object(dest_bucket.name, dest_key).copy_from(CopySource = {'Bucket': obj.bucket_name, 'Key': obj.key})

When I test this function with the basic "HelloWorld" test available from the AWS Lambda console, I receive this:

{
  "errorMessage": "'s3.Bucket.objectsCollectionManager' object is not callable",
  "errorType": "TypeError",
  "stackTrace": [
    [
      "/var/task/lambda_function.py",
      12,
      "lambda_handler",
      "for obj in bucket.objects():"
    ]
  ]
}

What changes do I need to make to my code so that, upon uploading a file to test-bucket-3x1, a Lambda function is triggered and the file is copied to test-bucket-3x2?

Thanks for your time.

gilch
Tkelly
  • Shouldn't you be using `for obj in bucket.objects.all()` instead of `for obj in bucket.objects()`? Refer to this link: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Bucket.objects – krishna_mee2004 Jan 16 '18 at 17:00
  • "object isn't callable" - you're trying to iterate on that. I think you might be looking to use `bucket.objects.all()` which creates an iterable – usernamenotfound Jan 16 '18 at 17:02
  • Thanks for the help. It seems silly, but that was really useful for me. I can go to the CloudWatch logs and start to get an idea of what the `event` and `context` objects actually are. On a related note, is it possible to open/work with a file in an s3 bucket via lambda? For instance, could I load a csv into a pandas dataframe, manipulate the dataframe, return the manipulated dataframe, and then upload that to my destination bucket? Would it be something as simple as putting within the event handler something like `df = pd.read_excel(['Records']['bucket']['object']['key'])`? – Tkelly Jan 16 '18 at 17:57
  • Note there is an S3 bucket replication feature in AWS, if you are genuinely copying all new objects from one bucket to another. – jarmod Dec 05 '20 at 01:33
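
As the comments above point out, `bucket.objects` is a collection manager, not a callable, so iterating with `.all()` fixes the TypeError. A minimal corrected version of the handler from the question (reusing the same bucket names) might look like this:

import boto3

s3 = boto3.resource('s3')

def lambda_handler(event, context):
    bucket = s3.Bucket('test-bucket-3x1')
    dest_bucket = s3.Bucket('test-bucket-3x2')

    # Iterate over the collection with .all() instead of calling it
    for obj in bucket.objects.all():
        s3.Object(dest_bucket.name, obj.key).copy_from(
            CopySource={'Bucket': obj.bucket_name, 'Key': obj.key})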

6 Answers

3

I would start with the s3-get-object blueprint. For more information about creating a Lambda function from a blueprint, use this page:

This is the code of the blueprint above:

console.log('Loading function');

const aws = require('aws-sdk');

const s3 = new aws.S3({ apiVersion: '2006-03-01' });


exports.handler = async (event, context) => {
    //console.log('Received event:', JSON.stringify(event, null, 2));

    // Get the object from the event and show its content type
    const bucket = event.Records[0].s3.bucket.name;
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
    const params = {
        Bucket: bucket,
        Key: key,
    };
    try {
        const { ContentType } = await s3.getObject(params).promise();
        console.log('CONTENT TYPE:', ContentType);
        return ContentType;
    } catch (err) {
        console.log(err);
        const message = `Error getting object ${key} from bucket ${bucket}. Make sure they exist and your bucket is in the same region as this function.`;
        console.log(message);
        throw new Error(message);
    }
};

You will then need to update the code above to not only get the object info but also copy the object and delete the source; for that, you can refer to this answer:

const AWS = require("aws-sdk"); // bucketname is assumed to be defined in the enclosing scope

const moveAndDeleteFile = async (file, inputfolder, targetfolder) => {
    const s3 = new AWS.S3();

    // Copy the object from inputfolder to targetfolder within the same bucket
    const copyparams = {
        Bucket : bucketname,
        CopySource : bucketname + "/" + inputfolder + "/" + file,
        Key : targetfolder + "/" + file
    };

    await s3.copyObject(copyparams).promise();

    // Then delete the original object from inputfolder
    const deleteparams = {
        Bucket : bucketname,
        Key : inputfolder + "/" + file
    };

    await s3.deleteObject(deleteparams).promise();
    ....
}

Source: How to copy the object from s3 to s3 using node.js

gmansour
0
for obj in source_bucket.objects.all():
    print(obj.key)
    copy_source = {'Bucket': source_bucket.name, 'Key': obj.key}
    destination_bucket.copy(copy_source, obj.key)
Dhiraj Das
  • Please don't only post code as an answer, but also provide an explanation of what your code does and how it solves the problem of the question. That will make your answer more valuable and is more likely to attract upvotes. – Mark Rotteveel May 25 '19 at 06:29
0

You should really use the event passed to lambda_handler() to get the file key (path/prefix/URI) and deal only with that file, since your event is triggered when a file is put into the bucket:

def lambda_handler(event, context):
    ...

    if event and event['Records']:
        for record in event['Records']:
            source_key = record['s3']['object']['key']

            ... # do something with the key: key-prefix/filename.ext
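
Combining the event key with the copy, a minimal sketch of a complete handler might look like the following (it reuses the bucket names from the question; keys in S3 event notifications are URL-encoded, hence the unquote_plus):

import urllib.parse

import boto3

s3 = boto3.resource('s3')

def lambda_handler(event, context):
    # Copy each newly created object from the source bucket to the destination bucket
    for record in event.get('Records', []):
        source_bucket = record['s3']['bucket']['name']   # e.g. test-bucket-3x1
        source_key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        copy_source = {'Bucket': source_bucket, 'Key': source_key}
        s3.Bucket('test-bucket-3x2').copy(copy_source, source_key)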

For the additional question about opening files from the S3 bucket directly, I would recommend checking out smart_open, which "kind of" handles the S3 bucket like a local file system:

from pandas import DataFrame, read_csv
from smart_open import open

def read_as_csv(file_uri: str) -> DataFrame:
    with open(file_uri) as f:
        return read_csv(f, names=COLUMN_NAMES)  # COLUMN_NAMES defined elsewhere
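
Following up on the pandas question from the comments, a rough sketch of reading a CSV from the source bucket, manipulating it, and writing the result to the destination bucket could look like this (it assumes smart_open and pandas are packaged with the function; the URIs and the added column are purely illustrative):

from pandas import read_csv
from smart_open import open

def transform_and_copy(source_uri: str, dest_uri: str) -> None:
    # e.g. source_uri = "s3://test-bucket-3x1/data.csv"
    #      dest_uri   = "s3://test-bucket-3x2/data.csv"
    with open(source_uri) as src:
        df = read_csv(src)
    df["processed"] = True  # placeholder manipulation
    with open(dest_uri, "w") as dst:
        df.to_csv(dst, index=False)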

thoroc
0
var AWS = require("aws-sdk");

exports.handler = (event, context, callback) => {
    var s3 = new AWS.S3();
    var sourceBucket = "sourcebucketskc";
    var destinationBucket = "destbucketskc";
    // The key of the newly uploaded object comes from the S3 event record
    var objectKey = event.Records[0].s3.object.key;
    var copySource = encodeURI(sourceBucket + "/" + objectKey);
    var copyParams = { Bucket: destinationBucket, CopySource: copySource, Key: objectKey };
    // Copy the object to the destination bucket under the same key
    s3.copyObject(copyParams, function(err, data) {
        if (err) {
            console.log(err, err.stack);
        } else {
            console.log("S3 object copy successful.");
        }
    });
};
0

You can also do it this way:

import boto3

def copy_file_to_public_folder():
    s3 = boto3.resource('s3')

    src_bucket = s3.Bucket("source_bucket")
    dst_bucket = "destination_bucket"

    # This prefix will get all the files, but you can also use
    # (Prefix='images/', Delimiter='/') for a specific folder.
    for obj in src_bucket.objects.filter(Prefix=''):
        print(obj.key)
        copy_source = {'Bucket': "source_bucket", 'Key': obj.key}

        # Define the name of the object in the destination bucket here
        dst_file_name = obj.key  # if you want to use the same name
        s3.meta.client.copy(copy_source, dst_bucket, dst_file_name)

This will basically take all the objects in the origin bucket and copy them to the destination bucket.

Muadiv
0

You can also use the AWS DataSync service to copy objects from the source S3 bucket to the destination S3 bucket. You can also enable a cron schedule to copy files from the source to the destination at a specific time.
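
For reference, a rough boto3 sketch of kicking off a DataSync task (assuming the task and its S3 source/destination locations were already created in the DataSync console; the ARN below is a placeholder):

import boto3

datasync = boto3.client('datasync')

# Placeholder ARN for an existing DataSync task that copies from the
# source bucket location to the destination bucket location.
# A cron schedule can also be attached to the task itself when it is created.
TASK_ARN = 'arn:aws:datasync:us-west-2:123456789012:task/task-EXAMPLE'

response = datasync.start_task_execution(TaskArn=TASK_ARN)
print(response['TaskExecutionArn'])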

Thank you.

veena dega