
I am trying to upload a pandas dataframe to a Redshift database from a Jupyter notebook. I went through the following link:

How to write data to Redshift that is a result of a dataframe created in Python?

Python: 3.6

Windows: 10

Here is the code:

import psycopg2  # driver SQLAlchemy uses for postgresql:// URLs
from sqlalchemy import create_engine

# XXXX is the port; user, password, host and db are defined earlier
sql_conn = create_engine('postgresql://{}:{}@{}:XXXX/{}'.format(user, password, host, db))

# df is the dataframe to upload
df.to_sql('data', con=sql_conn, schema="schema",
          index=False, if_exists='replace')
  1. The table has millions of records and hundreds of columns, and the upload takes hours (a chunked variant of the same call is sketched after this list).

  2. How can I upload the data to an S3 bucket using Python and then load it into Redshift? (My rough idea for the S3 step is also sketched below.)
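
On the pandas side, the furthest I've gotten is batching the rows into multi-row INSERTs (method='multi' needs pandas >= 0.24, and the chunk size below is a guess), but it is still far too slow for this volume:

# Same call as above, but batching rows into multi-row INSERT statements
df.to_sql('data', con=sql_conn, schema="schema",
          index=False, if_exists='replace',
          method='multi', chunksize=1000)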
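
For the S3 route in point 2, this is roughly what I have in mind; the bucket ('XXX') and key ('XX/data.csv') are the same placeholders as in the COPY command further down, and I'm assuming boto3 picks up AWS credentials from the environment:

import boto3

# Write the dataframe to a local CSV, then push it to S3
df.to_csv('data.csv', index=False)

s3 = boto3.client('s3')
s3.upload_file('data.csv', 'XXX', 'XX/data.csv')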

I tried this link:

How to copy csv data file to Amazon RedShift?

toRedshift = "COPY final_data from 's3://XXX/XX/data.csv' CREDENTIALS 
'aws_access_key_id=XXXXXXX;aws_secret_access_key=XXXX' removequotes  
 delimiter ',';"
 sql_conn.execute(toRedshift)

Error: Cannot COPY into nonexistent table final_data.

How can I create the table in Redshift from the S3 CSV file in Python? Is there an efficient way to do it without defining the column types by hand? The closest I have come is sketched below.
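
The only workaround I can think of is to derive the DDL from the dataframe itself and create the table before running COPY. pandas' get_schema can generate a CREATE TABLE statement from the dataframe's dtypes (passing con=sql_conn so the types come from the SQLAlchemy dialect), but I'm not sure the generated types are a good fit for Redshift:

from pandas.io import sql as pd_sql

# Generate a CREATE TABLE statement from the dataframe's dtypes;
# con=sql_conn makes pandas use the SQLAlchemy dialect for the column types
ddl = pd_sql.get_schema(df, 'final_data', con=sql_conn)

sql_conn.execute(ddl)         # create the target table first
sql_conn.execute(toRedshift)  # then run the COPY from S3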

I'm seeing the same issue today using COPY via psql. I've seen lots of inconsistency when running COPY commands in the AWS console as well. – dbaumann Apr 21 '20 at 21:50
