1

I'm using AWS Lambda to transfer data from S3 to Redshift Cluster. I added the DELIMITER ',' part in the code, but it still doesn't work. Is the syntax of the DELIMITER wrong or do I miss something in the code?

This is the lambda code :

import json
import psycopg2
import os

def lambda_handler(event, context):
    print("event collected is {}".format(event))
    for record in event['Records'] :
        s3_bucket = record['s3']['bucket']['name']
        print("Bucket name is {}".format(s3_bucket))
        s3_key = record['s3']['object']['key']
        print("Bucket key name is {}".format(s3_key))
        from_path = "s3://{}/{}".format(s3_bucket, s3_key)
        print("from path {}".format(from_path))
        Access_key = os.getenv('AWS_Access_key')
        Access_Secrete = os.getenv('AWS_Access_Secrete')
        dbname = os.getenv('dbname')
        host = os.getenv('host')
        user = os.getenv('user')
        password = os.getenv('password')
        tablename = os.getenv('tablename')
        connection = psycopg2.connect(dbname = dbname,
                                       host = host,
                                       port = '5439',
                                       user = user,
                                       password = password)
                                       
        print('after connection....')
        curs = connection.cursor()
        print('after cursor....')
        query = "COPY {} FROM '{}' CREDENTIALS 'aws_access_key_id={};aws_secret_access_key={}' DELIMITER ',' IGNOREHEADER 1 format CSV;".format(tablename,from_path,Access_key,Access_Secrete)
        print("query is {}".format(query))
        print('after query....')
        curs.execute(query)
        connection.commit()
        #print(curs.fetchmany(3))
        print('after execute....')
        curs.close()
        print('after curs close....')
        connection.close()
        print('after connection close....')

This is the table in Redshift Cluster:

CREATE TABLE "db_name"."table_example"
(
    quarter character varying(50) encode zstd,
    year  integer encode zstd,
    asin  character varying(50) NOT NULL encode zstd,
    gl_id integer encode zstd,
    tag   character varying(50) encode zstd,
    units numeric(36,4) encode zstd,
    pcogs numeric(36,4) encode zstd,
    CONSTRAINT table_example_pkey PRIMARY KEY(asin,quarter)
);

And this is the error in the logs:

Check 'stl_load_errors' system table for details.

enter image description here

Thank you

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
cey
  • 135
  • 1
  • 2
  • 10

1 Answers1

1

Update: The DELIMETER part was wrong, silly me.

The correct type: DELIMITER '\t'

Now I'm getting the error:

enter image description here

cey
  • 135
  • 1
  • 2
  • 10
  • The correct COPY command should be: querry = "COPY {} FROM '{}' CREDENTIALS 'aws_access_key_id={};aws_secret_access_key={}' DELIMITER '\t' IGNOREHEADER 1 CSV;".format(tablename,from_path,Access_key,Access_Secrete) – cey Aug 24 '22 at 12:35
  • The answer is given here: https://stackoverflow.com/questions/49372415/invalid-digits-on-redshift – cey Aug 24 '22 at 12:36