First of all, I have also read this question, since it seems to be similar.
My problem is that I am also trying to connect to our Apache Hadoop system, which is now secured by Kerberos. I use the impyla module for this. Before Kerberos was installed on the Hadoop system, this worked well. I have since tried several solutions from the internet and none of them seems to work, but I have to admit that I have never worked with Kerberos before.
This is the code I use:
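from impala.dbapi import connect

# Connect to Impala using Kerberos (GSSAPI) authentication
conn = connect(host=host,
               port=port,
               auth_mechanism='GSSAPI',
               kerberos_service_name='impala')
db_cursor = conn.cursor()
db_cursor.execute('SHOW DATABASES')
results = db_cursor.fetchall()
# Collect the database names (first column of each row) and print them
db_names = [row[0] for row in results]
for name in db_names:
    print(name)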
(host and port are passed as variables)
The error I get at the moment is: "no module named thrift_sasl"
Unfortunately, googling that error message did not turn up anything useful. Some say that the "pyKerberos" module needs to be installed, but I'm unsure whether that solves the problem.
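To narrow this down, a small check like the one below could show which of the modules in question are actually importable in my Python environment. This is only a sketch: thrift_sasl is taken from the error message, while sasl and kerberos (which I assume is the import name of pyKerberos) are guesses based on the suggestions I found, and missing_modules is just a helper name I made up.

```python
import importlib.util

# Module names are assumptions: thrift_sasl comes from the error message,
# sasl and kerberos are guesses based on suggestions found online.
CANDIDATE_MODULES = ["impala", "thrift_sasl", "sasl", "kerberos"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Report every candidate module that is not installed
    for name in missing_modules(CANDIDATE_MODULES):
        print(f"not installed: {name}")
```

Anything listed as "not installed" would presumably have to be installed into the same environment that runs the script before the connect call can get past the import.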
Is there something I forgot? I also have a Kerberos principal and password, which I manage with "MIT Kerberos Ticket Manager". But maybe I also have to provide that information in the code somehow?
Hopefully someone can help me because I'm quite stuck here. :-)