I'm using a Kerberos-enabled CDH cluster and I'd like to use PyHive to connect to Hive and read Hive tables. Here is the code I have:

from pyhive import hive
from TCLIService.ttypes import TOperationState


cursor = hive.connect(host='xyz', port=10000, username='my_username', auth='KERBEROS', database='poc', kerberos_service_name='hive').cursor()

I took the value of xyz from hive.metastore.uris in hive-site.xml. That property reads xyz:9083, but if I replace 10000 with 9083 in the call above, it complains.

My problem is that when I connect with port = 10000, I get a permission error when executing a query, even though I can read the same table from the Hive CLI or beeline. My questions are: 1) Is xyz the host I should use? 2) Which port should I use? 3) If both are correct, why am I still getting a permission error?

HHH
  • Port 9083 is for the Metastore service - used by fat clients (Spark, Pig, the legacy `hive` CLI, or HiveServer2). Port 10000 is for HiveServer2, to run SQL queries from a thin client - `beeline`, JDBC, ODBC, Python drivers. – Samson Scharfrichter Aug 28 '19 at 08:06
  • Kerberos authentication uses your local Kerberos ticket (translating the Kerberos principal into a local Hadoop user), and ignores whatever you state in `username =`. – Samson Scharfrichter Aug 28 '19 at 08:09
  • Are you connecting from Linux, macOS, or Windows? Do you explicitly create a Kerberos ticket with `kinit` (and on Windows, which `kinit`: the one provided by Microsoft, the one provided by Java, or an MIT Kerberos plug-in)? – Samson Scharfrichter Aug 28 '19 at 08:12
  • Read. The. Docs. It's complicated. – Samson Scharfrichter Aug 28 '19 at 08:13
  • I'm using Linux and have already created a Kerberos ticket using `kinit`. Should I change anything in my hive.connect() arguments? I'm still getting a permission error. – HHH Aug 28 '19 at 13:30
  • "Permission error": at which level? Not authenticated / no groups found for authenticated user / cannot contact Sentry for privs management / no permission on the _default_ database / no permission on scratch dir in HDFS / no permission on scratch dir on the Linux server running HiveServer2 / etc. Read the actual error message client-side, and dive into the HiveServer2 logs server-side to understand what actually happens... – Samson Scharfrichter Aug 28 '19 at 21:17
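
To make the points in the comments concrete, here is a minimal sketch of how the connection arguments could be derived from hive-site.xml. The helper names (`read_hive_site`, `hiveserver2_connect_args`, `run_query`) are hypothetical and not part of PyHive. The sketch assumes HiveServer2 runs on the same host as the first Metastore URI, which is not guaranteed on every cluster, and it falls back to port 10000 when `hive.server2.thrift.port` is not set. It leaves out `username`, since the Kerberos ticket decides the user, and it requires a valid ticket from `kinit` before `run_query` is called.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_hive_site(xml_text):
    """Return the <property> name/value pairs of a hive-site.xml document as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def hiveserver2_connect_args(props, database="default"):
    """Build keyword arguments for pyhive's hive.connect() with Kerberos auth.

    ASSUMPTION: HiveServer2 lives on the same host as the first Metastore URI.
    Ask your cluster admin if you are not sure.
    """
    first_uri = props["hive.metastore.uris"].split(",")[0]
    host = urlparse(first_uri).hostname  # "thrift://xyz:9083" -> "xyz"
    # Use the HiveServer2 port, never the Metastore port (9083).
    port = int(props.get("hive.server2.thrift.port", "10000"))
    return {
        "host": host,
        "port": port,
        "auth": "KERBEROS",
        "kerberos_service_name": "hive",
        "database": database,
        # No "username" key: with Kerberos the local ticket identifies the user.
    }


def run_query(connect_args, sql):
    """Run one statement through HiveServer2 and return all rows.

    Any exception is left to propagate so the full client-side error
    message stays visible for troubleshooting.
    """
    from pyhive import hive  # lazy import: the helpers above work without pyhive

    conn = hive.connect(**connect_args)
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()
```

With the values from the question, `hiveserver2_connect_args(props, database="poc")` yields host `xyz` and port 10000. If `run_query` still fails, the exception text and the HiveServer2 log for the same timestamp should show at which level the permission check is rejected.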

0 Answers