I am trying to upload a pandas dataframe to a Redshift database from a Jupyter notebook. I went through the following question:
How to write data to Redshift that is a result of a dataframe created in Python?
Python: 3.6
Windows: 10
Here is the code:
import psycopg2
import pandasql as ps
from sqlalchemy import create_engine

sql_conn = create_engine('postgresql://{}:{}@{}:XXXX/{}'.format(user, password, host, db))

df.to_sql('data', con=sql_conn, schema="schema",
          index=False, if_exists='replace')
The table has millions of records and hundreds of columns, and it takes hours to upload.
How can I upload the data to an S3 bucket using Python and then load it into Redshift?
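For the S3 step, the best I have come up with is writing the dataframe to a CSV file and pushing it with boto3, roughly like this (the bucket name and key are just placeholders, and I am not sure this is the right approach):

import boto3

# write the dataframe to a local CSV first; no header row so it matches
# the COPY options below (otherwise COPY would need IGNOREHEADER 1)
df.to_csv('data.csv', index=False, header=False)

# upload the file to S3; credentials come from the usual AWS config/env vars
s3 = boto3.client('s3')
s3.upload_file('data.csv', 'my-bucket', 'XX/data.csv')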
For the COPY part, I tried this link:
How to copy csv data file to Amazon RedShift?
toRedshift = ("COPY final_data from 's3://XXX/XX/data.csv' "
              "CREDENTIALS 'aws_access_key_id=XXXXXXX;aws_secret_access_key=XXXX' "
              "removequotes delimiter ',';")
sql_conn.execute(toRedshift)
Error: Cannot COPY into nonexistent table final_data.
How can I create the table in Redshift from the S3 CSV file using Python? Is there an efficient way to do it without defining the column types manually?
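The only workaround I can think of is to let pandas create an empty table with the inferred column types first, and then run the COPY against it, something like this (untested sketch, and I am not sure the types pandas picks are sensible for Redshift):

# create an empty table so COPY has a target; pandas/SQLAlchemy infer the
# column types from the dataframe (table lands in the default schema here)
df.head(0).to_sql('final_data', con=sql_conn, index=False, if_exists='replace')

# now the COPY from S3 should find the table
sql_conn.execute(toRedshift)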