
What is the recommended module and syntax to programmatically copy from an S3 CSV file to a Redshift table? I've been trying with the psycopg2 module, but without success (see psycopg2 copy_expert() - how to copy in a gzipped csv file?). I've tried cur.execute(), cur.copy_expert() and cur.copy_from(), all unsuccessfully. My experience, and comments I've read, lead me to conclude that psycopg2, while fine for programming a Postgres DB from Python, will not work for Redshift tables for some reason. So what is the workaround if I want a Python script to do this copy?

Here is the COPY statement I want to run. The source is a gzipped CSV file with a pipe delimiter. This works fine from a SQL interface like DBeaver, but I can't figure out how it would translate to Python:

'''COPY <destination_table> from 's3://bucket/my_source_file.csv.gz' CREDENTIALS <my credentials> delimiter '|' IGNOREHEADER 1 ENCODING UTF8 IGNOREBLANKLINES NULL AS 'NULL' EMPTYASNULL BLANKSASNULL gzip ACCEPTINVCHARS timeformat 'auto' dateformat 'auto' MAXERROR 100 compupdate on;'''
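
For context, my cur.execute() attempt looked roughly like this (the connection details are placeholders for my actual cluster):

import psycopg2

copy_sql = '''COPY <destination_table> from 's3://bucket/my_source_file.csv.gz'
CREDENTIALS <my credentials>
delimiter '|' IGNOREHEADER 1 ENCODING UTF8 IGNOREBLANKLINES
NULL AS 'NULL' EMPTYASNULL BLANKSASNULL gzip ACCEPTINVCHARS
timeformat 'auto' dateformat 'auto' MAXERROR 100 compupdate on;'''

# host, dbname, user and password below are placeholders
conn = psycopg2.connect(host='<cluster-endpoint>', port=5439,
                        dbname='<database>', user='<user>', password='<password>')
cur = conn.cursor()
cur.execute(copy_sql)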
Thom Rogers

2 Answers

I connect over ODBC using the pyodbc library successfully. Just call .execute() with the COPY command as a plain SQL string and you shouldn't have an issue.
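
A minimal sketch of that approach, assuming the Amazon Redshift ODBC driver is installed (the driver name, connection string and shortened COPY command are illustrative placeholders):

import pyodbc

# Driver name and connection details are placeholders for your own setup
conn = pyodbc.connect(
    'Driver={Amazon Redshift (x64)};'
    'Server=<cluster-endpoint>;Port=5439;Database=<database>;'
    'UID=<user>;PWD=<password>',
    autocommit=True)  # autocommit so the COPY is not left in an open transaction

# Abbreviated form of the COPY statement from the question
copy_command = "COPY <destination_table> FROM 's3://bucket/my_source_file.csv.gz' CREDENTIALS <my credentials> delimiter '|' gzip IGNOREHEADER 1;"

cur = conn.cursor()
cur.execute(copy_command)  # run the COPY as ordinary SQL over ODBC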

There are plenty of examples online of connecting to Amazon Redshift from Python. They typically look like:

import psycopg2

conn = psycopg2.connect(...)  # fill in your cluster endpoint, database, user and password
cur = conn.cursor()
cur.execute("COPY...")        # the full COPY ... FROM 's3://...' statement
conn.commit()                 # COPY runs inside a transaction, so commit it
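
Note that the conn.commit() matters: psycopg2 does not autocommit by default, so if the COPY is never committed (or conn.autocommit = True set beforehand) the load is rolled back when the connection closes.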
John Rotenstein