
First time asking a question on here, though I've benefitted from the wisdom and knowledge of the community for some time.

I am working in an AWS-based environment where we've been provided with a Python 3 kernel in JupyterHub, along with PostgreSQL and Hive. I want to use data stored in both PG and Hive within a single Python notebook, but I haven't been able to figure out how to connect from Python to Hive.

I've been able to load CSV files hosted on AWS S3, but I can't connect to the tables in the Hive databases. From what I'm seeing online, PyHive is the way to go, but I can't seem to get a successful connection.
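
For reference, the S3 part works with a plain CSV read along these lines (the bucket and file names below are placeholders):

import pandas as pd

# Placeholder path; pandas reads s3:// paths via s3fs
df = pd.read_csv("s3://my-bucket/path/to/data.csv")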

I use SSO with MFA to log in to all the cloud apps on AWS. How do I make the connection to Hive?

Here's what I have so far:

from pyhive import hive

def run_hive_query(db_session, query):
    # hostname and port are defined earlier in the notebook
    db_user = db_session["Database User"]
    rds_token = db_session["RDS Token"]

    try:
        # Authenticate with the DB user and RDS token via the CUSTOM auth mechanism
        conn = hive.connect(host=hostname, port=port, username=db_user,
                            password=rds_token, auth='CUSTOM')
        cur = conn.cursor()
        cur.execute(query)
        query_results = cur.fetchall()
        return query_results
    except Exception as e:
        return "Command failed due to {}".format(e)

Here's the error message:

"Command failed due to Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'"

I don't have admin access in this environment. Is my only option to request that SASL be installed?
