I want to run this simple script:
from pyhive import hive
import sqlalchemy
from impala.dbapi import connect
import pandas as pd
def conn():
    # Connection factory handed to SQLAlchemy: Kerberos (GSSAPI) over SSL
    return connect(
        host='mid.impala.mycompany.com',
        port=21050,
        auth_mechanism='GSSAPI',
        use_ssl=True,
        kerberos_service_name='impala',
        ca_cert='/opt/cloudera/security/pki/SSrootCA.pem',
    )
engine = sqlalchemy.create_engine('impala://', creator=conn)
pd.read_sql("SELECT * FROM giadb.a002_fnp_100 LIMIT 100", engine)
But I got this error:
TTransportException: TTransportException(type=1, message="Could not connect to ('mid.impala.mycompany.com', 21050)")
The Impala service is load balanced, so I think I need to set up the connection string properly, but I need some help. Thank you.
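For context, type=1 in the error is Thrift's NOT_OPEN code, meaning the transport could not be opened. A minimal sketch of a check that could narrow this down, assuming the first question is whether the load balancer address accepts TCP connections on that port at all, before SSL or Kerberos come into play. The helper name can_reach and the 5 second timeout are illustrative assumptions and not part of impyla.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a plain TCP connection to (host, port) can be opened.

    This only tests network reachability. It says nothing about SSL,
    Kerberos, or whether an Impala daemon is actually behind the port.
    """
    try:
        # create_connection resolves the name and tries each address in turn
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, connection refused, and timeouts
        return False
```

Calling can_reach('mid.impala.mycompany.com', 21050) from the same machine that runs the script would show whether the port is reachable. If it returns False, the cause is probably DNS, a firewall, or the load balancer listening on a different port. If it returns True, the cause is more likely in the SSL or Kerberos settings.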
Gianluca