Questions tagged [pyhive]
107 questions
2
votes
1 answer
How to add new dialect to Alembic besides built-in dialects?
Alembic supports only 5 built-in dialects: https://github.com/sqlalchemy/alembic/tree/master/alembic/ddl
Now I want to manage schemas in Apache Hive via Alembic, and I noticed that PyHive supports SQLAlchemy interfaces, so technically Alembic can support…

shawnzhu
- 7,233
- 4
- 35
- 51
2
votes
1 answer
Using pyhive with kerberos ticket to connect to kerberized hadoop cluster
I would like to connect to Hive on our kerberized Hadoop cluster and then run some HQL queries (obviously haha :)) from a machine that already has its own Kerberos client, which works; the keytab has been passed and tested.
Our Hadoop runs HWS 3.1 and…

la_femme_it
- 632
- 10
- 24
2
votes
0 answers
PyHive - Long running query timeout - [Errno 110] Connection timed out
I am running Hive queries from Python using PyHive. One of the queries takes around 12-15 minutes to complete. I can see that it completed in the Hadoop ResourceManager UI, but I am seeing the timeout error below in Python.
Error: Traceback (most recent…

Mahendra
- 21
- 3
2
votes
3 answers
Python/PyHive - extract specific error message from exception
I'm facing an exception similar to this one and I'm trying to handle it based on the error itself.
The problem is that pyhive.exc.OperationalError is very generic and covers everything from timeouts to non-existent tables, so I would need the exact…

Craig
- 1,929
- 5
- 30
- 51
2
votes
3 answers
Insert values from file to an existing table on hive
I am new to the Hadoop ecosystem. I was trying to create a Hive table from a CSV file using the query below.
CREATE EXTERNAL TABLE IF NOT EXISTS proxy_data(
date_time TIMESTAMP,time_taken INT, c_ip STRING,
sc_status INT, s_action STRING, sc_bytes INT,
…

Akash
- 100
- 1
- 10
2
votes
0 answers
Error Can't use pyhive to connect to Hive Database
This is the code used to connect to our Hive database. It ran fine a week ago, but now it seems to be failing to even open a session and get a cursor to execute the queries. The issue was temporarily fixed when I explicitly added a…

Abhikalp Unakal
- 41
- 6
2
votes
2 answers
How can i connect to presto pyhive?
I want to connect to Presto using PyHive in Zeppelin.
For now, I am following https://github.com/dropbox/PyHive.
I use the connect function with the correct parameters.
%python
from pyhive import presto
cursor = presto.connect(host='localhost',
…

lil
- 439
- 3
- 8
- 17
2
votes
1 answer
Python Pyhive module cannot import name hive
I want to connect Python to Hive using PyHive. I'm running the Python script below on my local machine.
#!/usr/bin/env python
# coding: utf-8
from pyhive import hive
from TCLIService.ttypes import TOperationState
def mysql_connect(host, port,…

user7422128
- 902
- 4
- 17
- 41
1
vote
0 answers
Standalone program runs on python 3.9 to connect hive using pyhive but AWS lambda throws No module named 'sasl.saslwrapper'
I have a small program that gets the list of databases in Hive via a Thrift server endpoint. I am using PyHive. When I run it as a standalone program it works perfectly fine. I am using Python 3.9.
Now when I run the same code from Lambda, it…

Ashish Kumar Mondal
- 459
- 6
- 13
1
vote
0 answers
Connecting to multiple hosts in Hive with SqlAlchemy
I've already had a working connection through ODBC using Cloudera ODBC Driver for Apache Hive, where I had my DSN set and all I needed was to call pyodbc.connect(f"DSN={mydsn}", autocommit=True).
Since I'm planning to use pandas on the query result,…

Kropiciel
- 75
- 6
1
vote
1 answer
How to use dbt seed properly with dbt-spark[PyHive] running in EMR?
Problem
I am trying to implement a new process using dbt seeds. When I use it in a Redshift connection there is no problem, but when I try to use it with dbt-spark[PyHive] in EMR some problems arise.
First Try
seed-paths: ["seeds"]
seeds:
…

Camila Lima
- 11
- 2
1
vote
0 answers
pyhive is it possible to stop a query job if hit ctrl+c
I use PyHive in Jupyter to connect to Hive/Presto for some ad hoc analysis. One annoyance is that if I cancel a submitted query job via 'ctrl + c', it only stops Jupyter, but doesn't stop the query job remotely. Is there a way, when 'ctrl + c', it…

user1269298
- 717
- 2
- 8
- 26
1
vote
0 answers
Error while installing SASL in Python 3.9.5 windows
I am trying to install PyHive to execute Hive queries in Python. As a prerequisite I tried installing SASL, and it gives the error below. My Python version is 3.9.5 and I am using Windows.
Can someone please help with this error?
ERROR: Command…

rajeshnov10
- 13
- 2
1
vote
0 answers
pyHive error TTransportException: Could not start SASL
I am trying to connect to a remote Hive instance using PyHive.
conn = hive.Connection(host='*********', port=10001, database='default', username='********',
auth='KERBEROS', kerberos_service_name='hive').cursor()
But I always get this…

Anthony Lauly
- 319
- 2
- 5
- 17
1
vote
0 answers
Getting error "UnicodeError: label too long" while trying to connect Hive from python
I am trying to connect to a Hive DB using the host and user credentials below.
I am getting the error "UnicodeError: label too long". Is there a way I can overcome this issue?
I tried the script below:
from pyhive import hive
import re, os, time
import pandas as pd
import…

Dilip
- 55
- 1
- 1
- 8