1

I am trying to connect to a Amazon redshift table. I created the table using SQL and now I am writing a Python script to append a data frame to the database. I am unable to connect to the database and feel that I have something wrong with my syntax or something else. My code is below.

from sqlalchemy import create_engine
conn = create_engine('jdbc:redshift://username:password@localhost:port/db_name')

Here is the error I am getting.

sqlalchemy.exc.ArgumentError: Could not parse rfc1738 URL from string 

Thanks!

Kulwant
  • 641
  • 2
  • 11
  • 28

1 Answers1

0

There are basically two options for connecting to Amazon Redshift using Python.

Option 1: JDBC Connection

This is a traditional connection to a database. The popular choice tends to be using psycopg2 to establish the connection, since Amazon Redshift resembles a PostgreSQL database. You can download specific JDBC drivers for Redshift.

This connection would require the Redshift database to be accessible to the computer making the query, and the Security Group would need to permit access on port 5439. If you are trying to connect from a computer on the Internet, the database would need to be in a Public Subnet and set to Publicly Accessible = Yes.

See: Establish a Python Redshift Connection: A Comprehensive Guide - Learn | Hevo

Option 2: Redshift Data API

You can directly query an Amazon Redshift database by using the Boto3 library for Python, including an execute_statement() call to query data and a get_statement_result() call to retrieve the results. This also works with IAM authentication rather than having to create additional 'database users'.

There is no need to configure Security Groups for this method, since the request is made to AWS (on the Internet). It also works with Redshift databases that are in private subnets.

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470