
What is the recommended module and syntax to programmatically copy from an S3 CSV file to a Redshift table? I've been trying with the psycopg2 module, but without success (see psycopg2 copy_expert() - how to copy in a gzipped csv file?). I've tried cur.execute(), cur.copy_expert() and cur.copy_from(), all unsuccessfully. My experience, and comments I've read, lead me to conclude that psycopg2, while fine for programming a Postgres DB from Python, will not work for Redshift tables for some reason. So what is the workaround if I want a Python script to do this copy?

Here is the COPY statement I want to run. The source is a gzipped CSV file with a pipe delimiter. This works fine from a SQL interface like DBeaver, but I can't figure out how it would translate to Python:

'''COPY <destination_table> from 's3://bucket/my_source_file.csv.gz' CREDENTIALS <my credentials> delimiter '|' IGNOREHEADER 1 ENCODING UTF8 IGNOREBLANKLINES NULL AS 'NULL' EMPTYASNULL BLANKSASNULL gzip ACCEPTINVCHARS timeformat 'auto' dateformat 'auto' MAXERROR 100 compupdate on;'''
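
For context, my cur.execute() attempt looked roughly like this (the connection details are placeholders for my actual cluster):

import psycopg2

copy_sql = '''COPY <destination_table> from 's3://bucket/my_source_file.csv.gz'
CREDENTIALS <my credentials>
delimiter '|' IGNOREHEADER 1 ENCODING UTF8 IGNOREBLANKLINES
NULL AS 'NULL' EMPTYASNULL BLANKSASNULL gzip ACCEPTINVCHARS
timeformat 'auto' dateformat 'auto' MAXERROR 100 compupdate on;'''

# host, dbname, user and password below are placeholders
conn = psycopg2.connect(host='<cluster-endpoint>', port=5439,
                        dbname='<database>', user='<user>', password='<password>')
cur = conn.cursor()
cur.execute(copy_sql)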
Thom Rogers

2 Answers

I connect over ODBC using the pyodbc library successfully. Just call .execute() with the COPY command as a plain SQL string and you shouldn't have an issue.
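
A minimal sketch of that approach, assuming the Amazon Redshift ODBC driver is installed (the driver name, connection string and shortened COPY command are illustrative placeholders):

import pyodbc

# Driver name and connection details are placeholders for your own setup
conn = pyodbc.connect(
    'Driver={Amazon Redshift (x64)};'
    'Server=<cluster-endpoint>;Port=5439;Database=<database>;'
    'UID=<user>;PWD=<password>',
    autocommit=True)  # autocommit so the COPY is not left in an open transaction

# Abbreviated form of the COPY statement from the question
copy_command = "COPY <destination_table> FROM 's3://bucket/my_source_file.csv.gz' CREDENTIALS <my credentials> delimiter '|' gzip IGNOREHEADER 1;"

cur = conn.cursor()
cur.execute(copy_command)  # run the COPY as ordinary SQL over ODBC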

There are plenty of examples online of connecting to Amazon Redshift from Python. They typically look like:

import psycopg2

conn = psycopg2.connect(...)  # fill in your cluster endpoint, database, user and password
cur = conn.cursor()
cur.execute("COPY...")        # the full COPY ... FROM 's3://...' statement
conn.commit()                 # COPY runs inside a transaction, so commit it
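
Note that the conn.commit() matters: psycopg2 does not autocommit by default, so if the COPY is never committed (or conn.autocommit = True set beforehand) the load is rolled back when the connection closes.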
John Rotenstein