Questions tagged [impyla]

Impyla is a Python client for HiveServer2 implementations (e.g., Impala, Hive) for distributed query engines.

Features:

  • HiveServer2 compliant; works with Impala and Hive, including nested data

  • Fully DB API 2.0 (PEP 249)-compliant Python client (similar to sqlite or MySQL clients) supporting Python 2.6+ and Python 3.3+.

  • Works with Kerberos, LDAP, SSL

  • SQLAlchemy connector

  • Converter to pandas DataFrame, allowing easy integration into the Python data stack (including scikit-learn and matplotlib); but see the Ibis project for a richer experience
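
Because the client follows DB API 2.0, the usual connect / cursor / execute / fetch pattern applies. The sketch below illustrates that pattern with the standard-library sqlite3 module as a stand-in so it runs without a cluster; the helper names `fetch_rows` and `sqlite_factory` are hypothetical. With impyla, the factory would presumably wrap `connect` from `impala.dbapi` instead (the host name in the comment is an assumption).

```python
import sqlite3
from contextlib import closing


def fetch_rows(connection_factory, query):
    """Run a query through any DB API 2.0 (PEP 249) connection factory.

    Returns the column names and all fetched rows, closing the cursor
    and the connection even if the query raises.
    """
    with closing(connection_factory()) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(query)
            # PEP 249: the first item of each description entry is the column name.
            columns = [col[0] for col in cursor.description]
            return columns, cursor.fetchall()


def sqlite_factory():
    """Stand-in factory: an in-memory SQLite database with a small table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
    return conn


# With impyla, an assumed equivalent factory might look like:
#   from impala.dbapi import connect
#   impala_factory = lambda: connect(host="impalad-host", port=21050)
columns, rows = fetch_rows(sqlite_factory, "SELECT id, name FROM t ORDER BY id")
```

Passing a factory rather than an open connection keeps the cleanup in one place, which is the main point of the sketch.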

52 questions
2
votes
0 answers

Could not start SASL when connecting to Hive with LDAP

When connecting to a Hive server without authentication, it works fine, like this: conn = connect(host='host.without.authenticate.', port=xxx, database=xxx, auth_mechanism='PLAIN') When connecting to a Hive server with LDAP authentication as follows,…
Jojo
  • 89
  • 8
2
votes
1 answer

How to handle exception in the "finally" block?

Given the following Python code: # Use impyla package to access Impala from impala.dbapi import connect import logging def process(): conn = connect(host=host, port=port) # Mocking host and port try: cursor = conn.cursor() …
Zelong
  • 2,476
  • 7
  • 31
  • 51
1
vote
0 answers

Impyla - getting SASL error for identical code on different server

We have a script using impyla that queries one server from another. It works fine and has no issues. We then copied the Anaconda environment to a different server with the exact same configuration as the original and tried the same script again and…
formicaman
  • 1,317
  • 3
  • 16
  • 32
1
vote
0 answers

Impyla - User does not have privileges to execute 'SELECT'

I am trying to connect to impala using Python (Impyla). I am able to connect, however, I always get an error message saying the user is not able to execute queries (i.e. User 'ABC' does not have privileges to execute 'SELECT'. The user shown in the…
formicaman
  • 1,317
  • 3
  • 16
  • 32
1
vote
0 answers

Impyla - Cannot find KDC for realm

I am trying to use Impyla to connect to impala on a remote server (Server1). I am able to connect and query from my local to Server1 using the following: from impala.dbapi import connect import impala.util conn = connect(host=my_impalad, port =…
formicaman
  • 1,317
  • 3
  • 16
  • 32
1
vote
1 answer

Error while running query on Impala with Superset

I'm trying to connect Impala to Superset. When I test the connection, it prints "Seems OK!", and when I try to see the databases on Impala with the SQL Editor on the left side, it shows all databases without problems. Preview of Databases/Tables But…
guilherme0170
  • 123
  • 1
  • 9
1
vote
2 answers

How to understand zlib-compressed query profiles of Apache Impala

Impala currently saves query profile logs at /var/log/impala/profiles , per line in the format As mentioned in their document at…
sumit kumar
  • 150
  • 1
  • 2
  • 13
1
vote
1 answer

Python - unable to read a large file

How do I read a large table from hdfs in jupyter-notebook as a pandas DataFrame? The script is launched through the docker image. libraries: sasl==0.2.1 thrift==0.11.0 thrift-sasl==0.4a1 Impyla==0.16.2 from impala.dbapi import connect from…
MacJei
  • 11
  • 1
1
vote
1 answer

How to connect to impala using impyla or to hive using pyhive?

I am trying to connect to impala using impyla with this code: from impala.dbapi import connect conn = connect(host='host_name.com', port=21050, user='usr', password='pass', use_ssl=True, auth_mechanism='LDAP') cursor =…
psowa001
  • 725
  • 1
  • 6
  • 18
1
vote
3 answers

Using Python to connect to Impala database (thriftpy error)

What I'm trying to do is very basic: connect to an Impala db using Python: from impala.dbapi import connect conn = connect(host='impala', port=21050, auth_mechanism='PLAIN') I'm using Impyla package to do so. I got this error: Traceback (most…
ds_enth
  • 49
  • 3
  • 9
1
vote
0 answers

Is using a timestamp field with concat(to_date) the most efficient way to query previous day in Impala?

I am querying data from HDFS using Impala in a python script using the python library Impyla. The specific data is proxy data and there is tons of it. I have a script that runs daily to pull the previous day and runs statistics. Currently I am…
sectechguy
  • 2,037
  • 4
  • 28
  • 61
1
vote
0 answers

Can't connect to unsecured Hive using Pyhive/impyla. Could not start SASL error

I'm trying to access an unsecured Hive (hive.server2.authentication is NONE) and I get the following error message in both pyhive and impala: TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism…
h3h325
  • 751
  • 1
  • 9
  • 19
1
vote
0 answers

Pip thrift and impyla not installing on Linux

I'm trying to get pip (3.6) to install the packages thrift-sasl, thrift-py and impyla, on CentOS 6, and I keep getting this error. Command "/usr/bin/python3.6 -u -c "import setuptools,…
Chase
  • 11
  • 2
1
vote
0 answers

Can't connect to Hiveserver2 using impyla

Could somebody help me solve the issue below on Windows 10? Here is a python code I have: from impala.dbapi import connect from contextlib import closing if __name__ == '__main__': with closing(connect(host='host_name_with_hiveserver2', …
1
vote
2 answers

Error importing Impyla library on Windows

I'm having trouble using the impyla library on Windows. I installed the impyla library: pip install impyla An error occurred when I tried to import the impyla library in Python code: from impala.dbapi import connect # error occurred from impala.util import…